Communication and Cybernetics 8 



Facts and Models 
in Hearing 

Edited by 

E. Zwicker and E. Terhardt




Springer-Verlag Berlin Heidelberg New York



Communication and Cybernetics 8 



Editors: W. D. Keidel and H. Wolter




Facts and Models 
in Hearing 



Proceedings of the 

Symposium on Psychophysical Models 
and Physiological Facts in Hearing 

held at Tutzing, Oberbayern, Federal Republic of Germany,
April 22-26, 1974 



Edited by 

E. Zwicker and E. Terhardt 




Springer-Verlag 

Berlin Heidelberg New York 1974 




Volumes 1 to 7 appeared when the series was called 
Kommunikation und Kybernetik in Einzeldarstellungen



Professor Dr.-Ing. Eberhard Zwicker 
Dr.-Ing. Ernst Terhardt 

Institut für Elektroakustik, Technische Universität München



The symposium was sponsored by the Deutsche Forschungsgemeinschaft and the
Bayerisches Staatsministerium für Unterricht und Kultus



With 176 figures 



ISBN-13: 978-3-642-65904-1   e-ISBN-13: 978-3-642-65902-7

DOI: 10.1007/978-3-642-65902-7 



Library of Congress Cataloging in Publication Data 

Symposium on Psychophysical Models and Physiological Facts in Hearing, Tutzing, 
Ger., 1974. Facts and models in hearing. 

(Communication and cybernetics, v. 8) “Sponsored by the Deutsche
Forschungsgemeinschaft and the Bayerisches Staatsministerium für Unterricht und Kultus.”
Bibliography: p. 

1. Hearing-Congresses. 2. Ear-Congresses. I. Zwicker, Eberhard, ed. II. Terhardt, E., 
1934 - ed. III. Deutsche Forschungsgemeinschaft (Founded 1949) IV. Bavaria. 
Staatsministerium für Unterricht und Kultus. V. Title.

[DNLM: 1. Ear-Physiology-Congresses. 2. Hearing-Congresses. 3. Models, Psycho- 
physical-Congresses. WV272 S9895f 1974] QP460.S95 1974 612’.85 

74-11221 



This work is subject to copyright. All rights are reserved, whether the whole or part of 
the material is concerned, specifically those of translation, reprinting, re-use of 
illustrations, broadcasting, reproduction by photocopying machine or similar means, and 
storage in data banks. Under § 54 of the German Copyright Law, where copies are made 
for other than private use, a fee is payable to the publisher, the amount of the fee to be 
determined by agreement with the publisher. © by Springer-Verlag Berlin Heidelberg 1974. 
Softcover reprint of the hardcover 1st edition 1974 




PREFACE 



During recent years auditory research has advanced quite 
rapidly in the area of experimental psychology as well as 
in that of physiology. Scientists working in both areas have 
in common the study of the process in HEARING, yet different 
scientific areas always tend to diverge. A SYMPOSIUM ON PSY- 
CHOPHYSICAL MODELS AND PHYSIOLOGICAL FACTS IN HEARING was or- 
ganized for the exchange of information and to stimulate dis- 
cussion between research workers in psychoacoustics, neurophy- 
siology, anatomy, morphology and hydromechanics. The basic aim 
of holding this symposium was to halt the divergence and to 
initiate the kind of multi-disciplinary research that will be 
needed to elucidate the hearing process as a whole. The present 
proceedings comprise the papers, which were circulated to the 
participants two months before the symposium and discussed during 
the symposium, together with some comments and additional re- 
marks. These comments and remarks do not, however, represent the 
full discussions but only the parts available in written form. 

We have arranged the material in five sections: 

I. Structure and Neurobiology of the Inner Ear 

II. Cochlear Mechanisms

III. Auditory Frequency Analysis 

IV. Auditory Time Analysis 

V. Nonlinear Effects 

Within the limits of a symposium, none of these topics could be 
treated comprehensively; moreover, most of the papers concerned 
problems having several aspects. The volume further contains three
papers which were not presented at the symposium by their authors:
Keidel's paper was read by Kallert; the papers of Engström and of
Møller were not presented orally at all.

Our main endeavor was to meet the demand for rapid publication 
of up-to-date information. In order to realize this goal, we pre- 
scribed certain specifications and restrictions which, of course, 
did not always please the authors; we apologize for this and thank
them for their cooperation. 







The organization of the symposium and the preparation of this 
volume would not have been possible without the help of 
A. Schumann, F. Eberding, H. Fleischer, A. Frei, J. Oelmann, 
H. Schütte, D. Schultz, W. Suchowerskyj, and several others.

The symposium as well as the publication of the proceedings 
were sponsored by the Deutsche Forschungsgemeinschaft and by
the Bayerisches Staatsministerium für Unterricht und Kultus.

Finally, we acknowledge the good cooperation with Springer- 
Verlag. 



May 1974



Eberhard Zwicker 
Ernst Terhardt 




CONTENTS 



PARTICIPANTS

I. STRUCTURE AND NEUROBIOLOGY OF THE INNER EAR

H. ENGSTRÖM, C. ANGELBORG  Morphology of the walls of the cochlear duct

H. SPOENDLIN  Neuroanatomy of the cochlea

R.R. PFEIFFER, C.E. MOLNAR, J.R. COX, JR.  Comments

A. FLOCK  Neurobiology of hair cells and their synapses

II. COCHLEAR MECHANISMS

V. NEDZELNITSKY  Measurements of sound pressure in the cochleae of anesthetized cats

P. DALLOS  Comments

J.P. WILSON  Basilar membrane data and their relation to theories of frequency analysis

L. ROBLES  Comments

J. TONNDORF  The significance of shearing displacements for the mechanical stimulation of cochlear hair cells

P. DALLOS  Comments

R. HELLE  Enlarged hydromechanical cochlea model with basilar membrane and tectorial membrane

M.R. SCHROEDER, J.L. HALL  A model for mechanical to neural transduction in the auditory receptor

H. DUIFHUIS  Comments

E. ZWICKER  A "second filter" established within the scala media (General Comment)

H. DUIFHUIS  An alternative approach to the second filter (General Comment)

III. AUDITORY FREQUENCY ANALYSIS

J.J. ZWISLOCKI, W.G. SOKOLICH  Neuro-mechanical frequency analysis in the cochlea

E.F. EVANS  Auditory frequency selectivity and the cochlear nerve

J. SCHWARTZKOPFF  Comments

E. ZWICKER  On a psychoacoustical equivalent of tuning curves

L.L.M. VOGTEN  Pure-tone masking; a new result from a new method

R.J. RITSMA, A. HOEKSTRA  Frequency selectivity and the tonal residue

B.L. CARDOZO  Frequency discrimination at the threshold

G. VAN DEN BRINK  Monotic and dichotic pitch matchings with complex sounds

E. TERHARDT  Comments

IV. AUDITORY TIME ANALYSIS

L.U.E. KOHLLÖFFEL  Recordings from spiral ganglion neurons

E.F. EVANS  Comments

G. BOERGER  Coding of repetition noise in the cochlear nucleus in cat

J.P. WILSON  Comments

F.A. BILSEN  Comments

W.D. KEIDEL  Information processing in the higher parts of the auditory pathway

A.R. MØLLER  Dynamic properties of cochlear nucleus units in response to excitatory and inhibitory tones

A. VOGEL  Roughness and its relation to the time-pattern of psychoacoustical excitation

H. FASTL  Transient masking pattern of narrow band maskers

T. HOUTGAST  Masking patterns and lateral inhibition

F.A. BILSEN  Comments

T. HOUTGAST  The slopes of masking patterns (General Comments)

H. FASTL  Comments

H. DUIFHUIS  A crude quantitative theory of backward masking

V. NONLINEAR EFFECTS

L. ROBLES, W.S. RHODE  Nonlinear effects in the transient response of the basilar membrane

J.P. LEGOUIX, M.C. REMOND  Nonlinear mechanisms and cochlear selectivity

P. DALLOS, MARY ANN CHEATHAM  Cochlear microphonic correlates of cubic difference tones

R.R. PFEIFFER, C.E. MOLNAR, J.R. COX, JR.  The representation of tones and combination tones in spike discharge patterns of single cochlear nerve fibers

G.F. SMOORENBURG  On the mechanisms of combination tone generation and lateral inhibition in hearing

R. HELLE, H. FASTL, E. ZWICKER  Comments

T.J.F. BUUNEN, F.A. BILSEN  Subjective phase effects and combination tones

E. TERHARDT  Pitch of pure tones: its relation to intensity




PARTICIPANTS 



F.A. Bilsen, Technische Hogeschool Delft, Lab. voor Natuurkunde,
Lorentzweg 1, Delft-8, Netherlands

J. Blauert, Institut für Elektrische Nachrichtentechnik der RWTH, 51 Aachen,
Alte Maastrichterstr. 23, West Germany

G. Boerger, Heinrich-Hertz-Institut für Schwingungsforschung, 1 Berlin 10,
Einsteinufer 37, West Germany

G. van den Brink, Faculteit der Geneeskunde, Erasmus Universiteit Rotterdam,
Lab. f. Biolog. a. Medical Physics, P.O. Box 1738, Rotterdam, Netherlands

B.L. Cardozo, Instituut voor Perceptie Onderzoek, Insulindelaan 2,
Eindhoven, Netherlands

P.J. Dallos, Auditory Research Laboratory, Northwestern University, 2299
Sheridan Road, Evanston, Illinois, 60201, U.S.A.

H. Duifhuis, Instituut voor Perceptie Onderzoek, Insulindelaan 2, Eindhoven,
Netherlands

E.F. Evans, Department of Communication, University of Keele, Keele,
Staffordshire, ST5 5BG, Great Britain

H. Fastl, Institut für Elektroakustik der Techn. Universität München,
Arcisstr. 21, D-8 München 2, West Germany

A. Flock, Konung Gustaf V forskningsinstitut, S-104 01 Stockholm 60,
Sweden

C.A.A.J. Greebe, Instituut voor Perceptie Onderzoek, Insulindelaan 2,
Eindhoven, Netherlands

R. Helle, Institut für Elektroakustik der Techn. Universität München,
Arcisstr. 21, D-8 München 2, West Germany

T. Houtgast, Institute for Perception TNO, Kampweg 5, Postbus 23, Soesterberg,
Netherlands

Mrs. John, Deutsche Forschungsgemeinschaft, 53 Bonn-Bad Godesberg,
Kennedyallee 40, West Germany

H.R. de Jongh, Wilhelmina Gasthuis, Academisch Ziekenhuis bij de Universiteit
van Amsterdam, ENT-Department, Amsterdam-Oud West, Netherlands

S. Kallert, I. Physiologisches Institut der Universität Erlangen-Nürnberg,
852 Erlangen, Universitätsstr. 17, West Germany

L.U.E. Kohllöffel, I. Physiologisches Institut der Universität
Erlangen-Nürnberg, 852 Erlangen, Universitätsstr. 17, West Germany

J.P. Legouix, Laboratoire de Neurophysiologie Générale, Collège de France,
11 Place Marcelin-Berthelot, Paris-5e, France

V. Nedzelnitsky, Eaton-Peabody Laboratory of Auditory Physiology, 42
Carlton Street, Cambridge, Massachusetts, 02142, U.S.A.

R.R. Pfeiffer, Department of Electrical Engineering, Washington University,
St. Louis, Missouri, 63130, U.S.A.

R. Plomp, Institute for Perception TNO, Kampweg 5, Postbus 23, Soesterberg,
Netherlands

R.J. Ritsma, Academisch Ziekenhuis Groningen, Kliniek voor Keel-Neus-en
Oorheelkunde, Oostersingel 59, Groningen, Netherlands

L. Robles, Laboratory of Neurophysiology, Medical School, University of
Wisconsin, 283 Medical Sciences Building, Madison, Wisconsin, 53706, U.S.A.

J.F. Schouten, Parklaan 36, Eindhoven, Netherlands

M.R. Schröder, Direktor des Dritten Physik. Instituts der Universität
Göttingen, Bürgerstr. 42-44, West Germany

J. Schwartzkopff, Institut für Allgemeine Zoologie der Ruhr-Universität
Bochum, 463 Bochum-Querenburg, Postfach 2148, West Germany

G.F. Smoorenburg, Institute for Perception TNO, Kampweg 5, Postbus 23,
Soesterberg, Netherlands

H. Spoendlin, Kantonsspital Zürich, Rämistr. 100, CH-8006 Zürich,
Switzerland

E. Terhardt, Institut für Elektroakustik der Techn. Universität München,
Arcisstr. 21, D-8 München 2, West Germany

J. Tonndorf, Department of Otolaryngology, College of Physicians and
Surgeons of Columbia University, 630 West 168th Street, New York 32, New
York, U.S.A.

A. Vogel, Institut für Elektroakustik der Techn. Universität München,
Arcisstr. 21, D-8 München 2, West Germany

L.L.M. Vogten, Instituut voor Perceptie Onderzoek, Insulindelaan 2,
Eindhoven, Netherlands

J.P. Wilson, Department of Communication, University of Keele, Keele,
Staffordshire, ST5 5BG, Great Britain

E. Zwicker, Institut für Elektroakustik der Techn. Universität München,
Arcisstr. 21, D-8 München 2, West Germany

J.J. Zwislocki, Lab. of Sensory Communication, Syracuse University, 821
University Avenue, Syracuse, New York, 13210, U.S.A.




I. Structure and Neurobiology of the Inner Ear 







MORPHOLOGY OF THE WALLS OF THE COCHLEAR DUCT 
H. ENGSTRÖM AND C. ANGELBORG

Department of Otolaryngology, University of Uppsala, Uppsala, Sweden 

The triangular and helical cochlear duct contains the organ of Corti. It 
is bordered upwards by Reissner's membrane, downwards by the basilar mem- 
brane on which the organ of Corti rests and laterally by the Stria vas- 
cularis. On the lower side of the basilar membrane a layer of mesothelial 
cells of varying thickness forms the tympanic cover layer. Of the three 
walls the Stria vascularis forms a very richly vascularized, presumably 
secretory epithelium. 

The cochlear duct is of ectodermal origin, developed from the otocyst. 

The fluid spaces outside the cochlear duct, the scala tympani and scala
vestibuli, are of mesothelial origin and contain perilymph. The length
of the cochlear duct varies between species; in man, Bredberg
(1968) found the average length to be about 34 mm. The perilymphatic
spaces of the two scalae are at the helicotrema in direct contact with
each other and they thus form a continuous fluid space on both sides of 
the cochlear duct. The basal coil of the cochlea is widest in diameter 
and the coils higher up gradually decrease in size. 

The endolymph of the cochlear duct differs with its high potassium content
considerably from the perilymph which instead has a higher sodium concen- 
tration. These two fluids differ also considerably in electrical potentials 
and even higher is the difference between the endolymph and the interior 
of the organ of Corti. Whether the interior of the organ of Corti contains 
a special fluid, cortilymph, as advocated by Engström (1960), is still
obscure. But if this is the case it must to a great extent resemble peri- 
lymph. It is however quite clear that there are several separate fluid- 
containing compartments in the cochlea and the inter-relation between these 
fluids is of great interest from a physiological point of view. These 
problems and a lot of the interesting literature have recently been sur- 
veyed by Vosteen (1970) and by Angelborg (1974). In many of the recent 
studies different forms of tracer substances (Thorotrast®, horse-radish
peroxidase, ferritin) have been used. These substances have been injected 
into perilymph, endolymph or cerebrospinal fluid and their fate inside the 
cochlea has been studied with the aid of the electron microscope. 







Engström and Angelborg: MORPHOLOGY OF COCHLEAR DUCT




Fig. 1. Modiolar sections through the cochlea of a squirrel monkey. A. 1½
coils from base. B. Middle coil. C. Upper basal coil. The arrow indicates
the tympanic covering layer and DC the cochlear duct. 





In the present report the intention is to describe the walls of the
cochlear duct. As there is a rich literature on Reissner's membrane
and on the Stria vascularis, but little is known of the basilar
membrane and especially of the tympanic cover layer, more space will be
devoted to the latter structure. A special reason to concentrate on the
tympanic cover layer is that there is every reason to assume that the
cells of that layer play a more important role than earlier understood.

MATERIAL AND METHODS

Over many years we have collected extensive material of inner
ears for light and electron microscopy. This material comes from guinea
pigs, squirrel monkeys, rhesus monkeys, cats, rabbits, rats and
chinchillas. Much of it has been used for earlier studies.

For the present study we have used guinea pigs in the portion concerning 
particle transportation in the inner ear and Angelborg (1974) has given a 
detailed description of this material. In this paper all particle micro- 
photographs are from 30 guinea pigs reported on by Angelborg. 

For the studies on the tympanic lamina and the tympanic border cells we 
have used 30 pigmented guinea pigs of both sexes with normal ear drums 
and normal Preyer reflexes. Of these, 5 animals weighing between 100 
and 1100 grams were examined with the scanning microscope (JEOL stereo- 
scan); the rest were studied with the transmission electron microscope. 

FIXATION

The animals were decapitated and the temporal bones were dissected out.
The stapes was removed, the oval window opened and the cochlea was
perfused with phosphate buffered 2.5% glutaraldehyde or veronal buffered
1.5% osmium tetroxide. All the cochleas were fixed within 5 minutes
after sacrifice.

PREPARATION FOR TRANSMISSION ELECTRON MICROSCOPY 

After the bone had been removed under the dissecting microscope, the
parts from the different coils were dehydrated in increasing
concentrations of ethanol and embedded in Epon 812 according to Luft (1961).
The specimens were sectioned on an LKB Ultrotome with a diamond knife. Thin
sections were stained with uranyl acetate and lead citrate and examined
in a Siemens Elmiskop 1A.

PREPARATION FOR SCANNING ELECTRON MICROSCOPY 



The cochlea was prepared in the same way as for transmission electron
microscopy. The specimens were dehydrated in alcohol and immersed in
acetone. Placed in an acetone-filled specimen boat, they were locked
in a Polaron Critical Point Drying Apparatus E 3000. There they were
impregnated at high pressure with liquid CO2. The temperature was slowly
raised to 40 °C at a pressure of 1200 p.s.i. (pounds per square inch)
and the CO2 evaporated in gaseous form. The dried specimens were then
mounted on the specimen stub using double-sided adhesive tape and coated
first with carbon and then with gold in a Siemens Evaporation Unit,
VBG 500. The specimens were studied in a Jeol (JSM-U3) Scanning
Electron Microscope.
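
As a rough plausibility check on the drying conditions quoted above (40 °C, 1200 p.s.i.), the following sketch converts the pressure to SI units and compares both values with the critical point of CO2. The critical-point figures (about 31.0 °C and 7.38 MPa) are standard textbook values assumed here, not data from this paper, and the function names are purely illustrative.

```python
# Illustrative check only; the constants are textbook values, not data from the paper.
PSI_TO_PA = 6894.757               # pascals per pound-force per square inch
CO2_CRITICAL_TEMP_C = 31.0         # assumed critical temperature of CO2, deg C
CO2_CRITICAL_PRESSURE_MPA = 7.38   # assumed critical pressure of CO2, MPa


def psi_to_mpa(psi: float) -> float:
    """Convert a pressure from p.s.i. to megapascals."""
    return psi * PSI_TO_PA / 1e6


def exceeds_co2_critical_point(temp_c: float, pressure_psi: float) -> bool:
    """Return True if both temperature and pressure lie above the CO2 critical point."""
    return (temp_c > CO2_CRITICAL_TEMP_C
            and psi_to_mpa(pressure_psi) > CO2_CRITICAL_PRESSURE_MPA)


if __name__ == "__main__":
    print(f"1200 p.s.i. = {psi_to_mpa(1200.0):.2f} MPa")
    print("Above CO2 critical point:", exceeds_co2_critical_point(40.0, 1200.0))
```

Under these assumed values, both the stated temperature and the stated pressure lie above the critical point of CO2, which is what a critical point drying step would be expected to require.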

RESULTS 

As stated earlier the triangular cochlear duct is bordered 'upwards' by 
the Reissner membrane, laterally by the Stria vascularis and 'downwards' 
by the basilar membrane (Fig. 1, 2). These three borders will in the 
following be described separately and special attention shall be devoted 
to the basilar membrane and its mesothelial layer of cells. 

Reissner's membrane has long been known to consist of one ectodermal
layer, turned towards the cochlear duct and the endolymph, and
one mesothelial layer turned towards the perilymph. Its thickness varies:
it is thickest in the region of cell nuclei, where it may reach 10 µ,
and it thins out in the region between nuclei, where it amounts to only
1-3 µ. The two layers differ considerably in structure, as pointed out
by several authors (Duvall & Rhodes, 1967; Iurato, 1967; Iurato &
Taidelli, 1967; Hinojosa, 1971). They are separated by a thin basement
membrane in which occasional fibrous strands can be found. The
ectodermal cells, the cells turned towards the endolymph, are flattened
epithelial cells, thickest in the region of the nucleus and with small
microvilli at their surfaces. They have a rich endoplasmic reticulum
with rather many ribosomes (see Fig. 3). At the surface numerous
invaginations can be seen. The cytoplasm contains many forms of coated
or uncoated vesicles, multi-vesicular bodies and lysosomes. Between the
cells there are numerous well developed cell junctions.

The perilymphatic cells are of mesothelial origin but otherwise resemble
endothelial cells in form. They may vary in thickness in different
portions of the membrane, and we have occasionally seen two layers
of cells and also irregular prolongations from the surface, but no real
microvilli as on the ectodermal side. They may occasionally have numerous
ribosomes, but otherwise they have a rather simple cytoplasm with
few cytoplasmic organelles and few mitochondria. Occasional kinocilia
have been observed on both the mesothelial and the ectodermal side.

Fig. 3. Reissner's membrane consisting of two layers of cells. The
endolymphatic side is made of ectodermal cells, the perilymphatic one of
mesothelial cells. There is a thin basement membrane between the two layers.
The cells are quite different in structure. There are many microvilli on
the ectodermal side and also a well developed endoplasmic reticulum.

Hinojosa (1971) has made a careful study of the transportation of fer- 
ritin across the Reissner membrane. He assumes that ferritin is trans- 
ported across the perilymphatic cell layer predominantly by diffusion 
through intercellular spaces, while the transport in the endothelial 
cells of muscle capillaries as pointed out by Hinojosa. If tracers are
injected in the endolymph the ferritin is taken up by coated invaginations 
and in other organelles of the endolymphatic cell of the Reissner mem- 
brane. No ferritin is seen in the extracellular spaces. This is in 
good agreement with recent studies by Angelborg (1974) and the conclusion 
is that the transport over the Reissner membrane is mainly from the 
perilymph to the endolymph.

The basilar membrane forms a fibrous layer below the organ of Corti and
reaches from the modiolus to the spiral ligament. It consists of one
Pars tecta reaching from the modiolar attachment to the outer pillar
cells and one Pars pectinata reaching from the outer pillar to the spiral
ligament. Claudius (1856) seems to have been the first to use the
name Membrana basilaris, and he described how it consisted of an inner,
narrow and less fibrillated portion and an outer fibrillar region. Many
later authors have given detailed descriptions of the basilar membrane,
and it was also pointed out early that the membrane is covered on its
lower side by a cell layer of varying thickness. This layer can
be seen in our Figures 1 and 2.

The basilar membrane consists in the Pars tecta of a single fibrous plate,
which under the outer pillar divides into two separate layers that fuse
again before the membrane inserts laterally in the spiral ligament. At
the border with the epithelial cells of the organ of Corti there is a thin
basement membrane.

In the Pars tecta the different layers are as follows (Fig. 4A): 

1. Plasma membrane of cells in the organ of Corti 

2. Basement membrane 

3. One rather dense fibrous layer 

4. One homogenous layer 

5. Tympanic cover layer 







Fig. 4. A. Basilar membrane (BM) dividing into two layers under an outer
pillar (OP). The attachment of the pillar to the basilar membrane is seen and
also the tympanic covering layer. B. Basilar membrane in Pars pectinata
consisting of basement membrane (arrow), upper fibrous layer (1), upper
homogenous layer (2), lower fibrous layer (3) and lower homogenous layer (4).
Below, tympanic covering cells. Some Boettcher cells (BC) are seen above
the basilar membrane. Lower basal coil, squirrel monkey.






In the Pars pectinata the layers are (Fig. 4 B): 

1. Plasma membrane of cells in the organ of Corti 

2. Basement membrane 

3. Upper fibrous layer 

4. Upper homogenous layer 

5. Lower fibrous layer 

6. Lower homogenous layer 

7. Tympanic cover layer 

In the upper homogenous layer occasional cells of a connective tissue 
cell type are seen. They have long oval nuclei and irregular cell 
processes. Angelborg (1974) has observed kinocilia protruding from
these cells. 

In the pectinate zone the fibrils are gathered in rather coarse bundles. 

A very good description of the basilar membrane was given by Iurato
(1962), who demonstrated differences in compactness and organisation in
the various cochlear coils.

It is also of interest to note that the fibrillar portion of the basilar
membrane, which usually receives most of the attention, makes up a
considerably smaller part of the membrane than the inter-fibrillar material,
the tympanic covering layer and the blood vessels.

The basilar membrane varies in structure and also in size in the different
cochlear coils. It is narrowest near the round window and becomes
progressively wider toward the helicotrema.

The lower side of the basilar membrane is covered by a layer variously
called the tympanic lamina, the tympanic border cells, the basilar membrane
cells or the cells of the tympanic covering layer (TCL). We prefer the last
name. These cells are of mesothelial origin and they form a rather
complicated layer. Scanning electron microscopy is well suited here and
demonstrates the individual cells clearly (Figures 5-7). The cells are
spindle shaped with very long cell processes running at right angles to the
fiber strands in Pars pectinata. That means that they run longitudinally
from the base of the cochlea toward the top. In the basal coil
of guinea pigs these cells form a rather flat layer (Fig. 5). The cells
in the middle coils are more separated but very regular in direction,
while the cells in the top coil become rather irregular. Each cell is
provided with a more or less rudimentary kinocilium (Angelborg, 1974)
(Fig. 6). There also seems to be a very great species variation: the
squirrel monkey, for instance, has a very complex tympanic covering layer
in the basal coil, while the guinea pig has a rather simple arrangement
in the corresponding coil.

Fig. 5. Tympanic covering layer from a guinea pig. A. Basal coil. B.
Second coil from base. C. Third coil from base. Note the kinocilium in
A and C (arrows).

Fig. 6. A. Tympanic covering layer from the fourth coil of a guinea pig
cochlea. Irregular arrangement of the cells. B. Higher magnification
of two cells with kinocilia (arrows). Inset shows nine slightly
irregular double fibrils.

Fig. 7. Tympanic covering layer from guinea pig, basal coil.

Rather little interest has been devoted to the tympanic covering layer
until recently, but studies by v. Ilberg and members of the Vosteen group,
by Duvall et al., and by Angelborg have increased the interest in
these cells. They are of particular interest from a phagocytic point of
view and in relation to the interconnection between perilymph and
cortilymph through the basilar membrane. Recent tracer studies have
given varying results. v. Ilberg (1968) presented results indicating a
free passage of thorium dioxide from perilymph to the fluid spaces in the
organ of Corti, i.e. the cortilymph. When Angelborg reduced the amount of
tracer or injected it through the cerebrospinal fluid, he could not find
any passage from perilymph to cortilymph. Duvall and Quick (1969) obtained
similar results. Angelborg has extended his experiments to dead animals,
in which he injected tracer and found a post-mortem transportation of
tracers. His conclusion is that different modes of injecting the tracer
can give very different results. In many of the experiments very large
quantities of tracer have been used, and the results often include
extensive artifacts.

Fig. 8. The basilar membrane (BM) under an outer pillar (OP). The
tympanic cover layer is in this middle coil of a squirrel monkey very
thick with large numbers of branches.

Experiments by Duvall and Sutherland (1972) and Jahnke (1972) indicated 
that there is a passage from perilymph to cortilymph and especially with 
horse radish peroxidase the passage seems good. 

The functional importance of the tympanic covering layer from a
wave-mechanical point of view has been very little discussed. Still it forms,
as can be seen in our Fig. 8, a very thick layer of longitudinal fibrils
suspended in perilymph, and it could very well have an important functional
significance, for instance as a damping layer. Its different texture in
different cochlear coils indicates a functional difference in these coils.

Stria vascularis is a richly vascularized membrane along the lateral wall
of the cochlear duct. Its blood vessels differ in density in the different
cochlear coils, and the basal coil is much richer in blood vessels than the
top coil. This has recently been described by Axelsson (1968) and by Sugar
et al. (1972). The general form of the vascular network and the
ultrastructure of the dark and light cells are seen in Figure 9.

Fig. 9. Stria vascularis from a guinea pig cochlea (A) and a squirrel
monkey (B). A demonstrates the rich network in the basal coil and B dark and
light cells as well as pigment granules below these cells (dark particles
to the right).

Stria vascularis has by many authors been compared to a micro-kidney and 
it contains complicated infoldings, large numbers of mitochondria and it 
is rich in enzymes. Below the epithelium and surrounded by extensions from 
the epithelial cells are blood vessels of different sizes. In the subepit- 
helial region there are in non-albino animals large numbers of pigment 
granules. 

The cochlear duct is surrounded by three, from a structural point of view 
completely different, walls, all with their own specific functions. A de- 
tailed knowledge of these walls is necessary for our understanding of 
cochlear function. 

REFERENCES: ANGELBORG, C. (1974). Distribution of macromolecular tracer
particles (Thorotrast®) in the cochlea. An electron microscopic study in
guinea pig. Acta Otolaryng. (Stockholm) Suppl. 319, in press. AXELSSON, A.
(1968). The vascular anatomy of the cochlea in the guinea pig and in man.
Acta Otolaryng. (Stockholm) Suppl. 243. BREDBERG, G. (1968). Cellular
pattern and nerve supply of the human organ of Corti. Acta Otolaryng.
(Stockholm) Suppl. 236. CLAUDIUS, M. (1856). Bemerkungen über den Bau der
häutigen Spiralleiste der Schnecke. Zeitschrift für Wissenschaftliche
Zoologie 7, 134. DUVALL III, A.J. & RHODES, V.T. (1967). Ultrastructure
of the organ of Corti following intermixing of cochlear fluids. Ann. Otol.
Rhinol. Laryngol. 76, 688. DUVALL III, A.J. & QUICK, C.A. (1969). Tracers
and endogenous debris in delineating cochlear barriers and pathways. Ann.
Otol. Rhinol. Laryngol. 78, 1041. DUVALL III, A.J. & SUTHERLAND, C.R. (1972).
Cochlear transport of horseradish peroxidase. Ann. Otol. Rhinol. Laryngol. 81,
705. ENGSTRÖM, H. (1960). The cortilymph, the third lymph of the inner
ear. Acta Morphol. Neerl. Scand. 3, 195. HINOJOSA, R. (1971). Transport of
ferritin across Reissner's membrane. Acta Otolaryng. (Stockholm) Suppl. 292,
5. ILBERG V., CH. (1968). Elektronenmikroskopische Untersuchungen über
Diffusion und Resorption von Thoriumdioxyd an der Meerschweinchenschnecke.
4. Mitteilung: Basilarmembran und Cortisches Organ. Arch. Klin. Exp. Ohren
Nasen Kehlkopfheilkd. 192, 384. IURATO, S. (1962). Submicroscopic structure
of the membranous labyrinth. III. The supporting structure of Corti's organ
(basilar membrane, limbus spiralis and spiral ligament). Z. Zellforsch.
Mikrosk. Anat. 56, 40. IURATO, S. (1967). Submicroscopic structure of the
inner ear (ed. S. Iurato). Pergamon Press, Oxford. IURATO, S. & TAIDELLI,
G. (1967). Struttura della membrana di Reissner. Boll. Soc. Ital. Biol. Sper. 43,
1657. JAHNKE, K. (1972). Verteilung intrathekal applizierter Peroxydase in
der Meerschweinchen-Cochlea. Arch. Klin. Exp. Ohren Nasen Kehlkopfheilkd. 202,
418. LUFT, J.H. (1961). Improvements in epoxy resin embedding methods. J.
Biophys. Biochem. Cytol. 9, 409. SUGAR, J., ENGSTRÖM, H. & STAHLE, J. (1972).
Stria vascularis. In: Inner ear studies. Acta Otolaryng. (Stockholm) Suppl.
301. VOSTEEN, K.H. (1970). Passive and active transport in the inner ear.
Arch. Klin. Exp. Ohren Nasen Kehlkopfheilkd. 195, 226.







NEUROANATOMY OF THE COCHLEA

H. SPOENDLIN

ENT-Department, University of Zurich, Switzerland



Three innervation components of the cochlea are known: 

- The first, numerically by far the most important component 
consists of the afferent bipolar cochlear sensory neurons. 

- The second is an efferent nerve supply, mainly from the crossed
and uncrossed olivo-cochlear tract, which comes to the periphery
with the vestibular nerve and reaches the cochlear nerve
through the anastomosis of Oort.

- The third component consists of an autonomic nerve supply 
which originates in the superior cervical ganglion and most 
probably does not enter the organ of Corti. 

All nerve fibres lose their myelin sheaths before they enter
the organ of Corti at the habenula perforata. Within the organ
of Corti the nerve fibres distribute as small unmyelinated
fibres radially and spirally in a very definite pattern (fig. 1).






Fig. 1: Schema of the 3 innervation components with the various
nerve tracts in the organ of Corti. iR = inner radial fibres,
iS = inner spiral fibres, TS = tunnel spiral fibres, TR = tunnel
radial fibres, B = basilar fibres, OS = outer spiral fibres.



An important efferent nerve supply of the organ of Corti
has been demonstrated in many animals (Iurato 1962, Kimura and
Wersäll 1962, Spoendlin and Gacek 1963, Smith and Rasmussen 1963).
It consists of about 500 fibres, of which about 3/4 originate in the
contralateral and about 1/4 in the homolateral superior olivary
complex (Rasmussen 1960). After considerable branching they form the
50-200 inner fibres, 0.1-0.2 µ thick, all upper tunnel radial fibres
and the approximately 40 000 large vesiculated nerve endings
connected to the outer hair cells in decreasing numbers from base
to apex (fig. 1 and 2) (Spoendlin 1966a, 1969a & b, 1973).

The efferents of the outer hair cells belong to the contralateral
olivo-cochlear fibres and degenerate within one or two days after
transection of the efferent bundle at the floor of the fourth
ventricle or in the vestibular nerve. The efferents of the inner hair
cell system, on the other hand, originate to about 50% in the
homolateral superior olive and take several weeks to degenerate
after lesions of the homo- and contralateral olivo-cochlear
fibres (fig. 8).

The efferents of the inner and outer hair cell region differ not
only in caliber, distribution pattern and degeneration behaviour
but also in their synaptic connections. At the level of the outer
hair cells they synapse almost exclusively with the sensory cell
and at the level of the inner hair cells predominantly with the
afferent dendrites (Spoendlin 1969b) (fig. 2 and 7).

Fig. 2: Efferent (e) and afferent (a) nerve endings at the base of
an outer hair cell. The efferents synapse with the hair cell where
a subsynaptic cisterna faces the synaptic area. Efferents and
afferents are mostly separated by supporting cells (S). Only
coated vesicles (v) and no synaptic bars are seen at the afferent
synaptic sites.

After elimination of the efferent fibres we find the organ
of Corti with an exclusively afferent nerve supply. It consists
of practically all inner radial fibres, the basilar fibres and
the outer spiral fibres. The only afferent nerve fibres leading
to the outer hair cell region are the basilar fibres, which cross
the tunnel at the bottom, usually hidden in invaginations of the
supporting cells (fig. 1). Their total number is surprisingly
small, with 2500 to 3000 fibres in one cochlea, which represents
only about 5% of the total number of afferent cochlear neurons
(Spoendlin 1966b). 95% of the afferent cochlear neurons are
connected to the inner hair cell system (Spoendlin 1969a). In
normal animals 15% of all efferent and afferent nerve fibres cross
the tunnel and all the others remain at the level of the inner hair
cells. Only about 1/3 of all tunnel crossing fibres are afferent
(basilar fibres) (Spoendlin 1973). The maximum innervation
density is in the upper basal turn (fig. 3) (Spoendlin 1972a). The
association of the majority of afferent neurons to the inner hair
cells is also illustrated by the fact that substantial retrograde
degeneration starts only after destruction of the inner hair
cells (Spoendlin 1973).

Fig. 3: Nerve fibre densities in different cochlear turns at the
level of the habenula (1) and at the level of the tunnel (2)
including all fibres (afferents and efferents). Ordinate: number
of nerve fibres per 200 µ.
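
The proportions quoted above can be combined into a rough bookkeeping of fibre numbers. The following minimal Python sketch only uses the approximate figures given in the text (2500 to 3000 basilar fibres, about 5% of all afferents, about 1/3 of the tunnel crossing fibres being afferent); the function name and the derived totals are illustrative assumptions and are not values reported by the author.

```python
# Back-of-envelope bookkeeping for the fibre counts quoted in the text.
# Inputs are the approximate figures from the paper; the derived totals
# are illustrative only and are not reported by the author.

OUTER_SHARE_PERCENT = 5        # basilar fibres as a share of all afferents
TUNNEL_AFFERENT_FRACTION = 3   # afferents are about 1/3 of tunnel crossing fibres


def implied_counts(basilar_fibres):
    """Return fibre totals implied by a given number of basilar fibres."""
    total_afferents = basilar_fibres * 100 // OUTER_SHARE_PERCENT
    inner_afferents = total_afferents - basilar_fibres
    tunnel_crossing = basilar_fibres * TUNNEL_AFFERENT_FRACTION
    return {
        "total_afferents": total_afferents,
        "inner_afferents": inner_afferents,
        "tunnel_crossing": tunnel_crossing,
    }


for n in (2500, 3000):
    print(n, implied_counts(n))
```

Under these assumptions the two ends of the quoted range bracket the total afferent population and the number of fibres crossing the tunnel; because every input is itself an approximation, the results should be read as orders of magnitude only.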





Each inner hair cell is innervated by about 20 afferent,
unbranched, strictly radial neurons (fig. 8) whose small endings
usually form one typical synapse with the hair cell (fig. 7). The
afferents to the outer hair cells, on the other hand, take a long
spiral course as outer spiral fibres between the Deiters' cells
before they send their terminal collaterals to the hair cells. On
the basis of the facts that all outer spiral fibres are the
continuation of the basilar fibres, that there is an average of one
basilar fibre per outer pillar and that we find about 100 outer
spiral fibres at any given place, we can conclude that the outer
spiral fibres extend spirally over an average distance of 100
pillars, corresponding to 0.6-0.7 mm (Spoendlin 1968). In the last
200 µ of its course each fibre sends collaterals to about 10 outer
hair cells, and each hair cell in the basal turn is provided with
about 4 afferent endings of different neurons according to the
principle of multiple innervation (Spoendlin 1969b). Just recently
a similar extension and arrangement of outer spiral fibres has
been confirmed by Smith (1972) in the rat by direct visualization
of single fibres in histochemical preparations.
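
The estimate of the spiral extent rests on a simple steady-state argument: if fibres enter at a constant rate per pillar and all run about the same distance, the number of fibres seen at any one place equals the entry rate times the run length. A minimal Python sketch of that reasoning follows; the function name and the derived pillar spacing are illustrative assumptions, and the length is passed in micrometres (0.6 mm = 600) purely for convenience.

```python
def spiral_extent(fibres_at_any_place, fibres_entering_per_pillar, extent_um):
    """Steady-state estimate of how far an outer spiral fibre runs.

    fibres_at_any_place: fibres counted in one cross-section (about 100)
    fibres_entering_per_pillar: basilar fibres per outer pillar (about 1)
    extent_um: measured spiral extent in micrometres (600 to 700)
    """
    # Run length in pillars = fibres present / fibres entering per pillar.
    extent_pillars = fibres_at_any_place / fibres_entering_per_pillar
    # Implied centre-to-centre spacing of the outer pillars (derived value).
    pillar_spacing_um = extent_um / extent_pillars
    return extent_pillars, pillar_spacing_um


for extent_um in (600, 700):
    print(spiral_extent(100, 1, extent_um))
```

The pillar spacing returned by the sketch is only what the quoted numbers imply; it is not a measurement given in the text.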

The afferent neurons of the inner and outer hair cells are 
not only distinguished by their distribution pattern but also by 
structural and metabolic differences and on the basis of their en- 
tirely different degeneration behaviour. 

Typical synaptic complexes with synaptic bars are usually
found between afferent terminals and inner hair cells (fig. 7) but
they are missing in the afferent nerve connections of the outer
hair cells of the cat (fig. 2). The outer spiral fibres always
contain a great number of neurocanaliculi but no neurofilaments
in their axoplasm, whereas the neurofilaments are predominant in
the inner radial fibres. Metabolically, the afferent dendrites to
the inner hair cells are very susceptible to hypoxia, showing
enormous swellings after even short periods of hypoxia, whereas
the afferent fibres for the outer hair cells remain unchanged
(Spoendlin 1974).





The two types of afferent neurons exhibit an entirely different
degeneration behaviour (Spoendlin 1965, 1971). The great number
of radial neurons of the inner hair cell system undergoes a
complete secondary degeneration after transection of the cochlear
nerve in the internal acoustic meatus, whereas the afferent neurons
to the outer hair cells remain in normal numbers and appearance
even after survival times of more than one year.



At the level of the spiral ganglion most ganglion cells will
degenerate and normally disappear within four months after
transection of the cochlear nerve in the cat. However, there are
always 5-8% of the ganglion cells remaining, scattered regularly
throughout Rosenthal's canal in all turns of the cochlea (Spoendlin 1971).

The great majority of spiral ganglion cells in the cat (type I)
is large and myelinated, with a central, round nucleus, light
chromatin and a very prominent nucleolus. The cytoplasm contains
many ribosomes and practically no filaments (fig. 4). All
these ganglion cells are susceptible to secondary degeneration
after section of the cochlear nerve in the internal acoustic meatus.

Fig. 4: Schematic representation of the 3 types of spiral ganglion
cells with their respective percentages. Since the type III
ganglion cells are not found in normal spiral ganglia it is assumed
that they derive from the type I ganglion cells under the influence
of secondary degenerative changes. The axons of type I and III are
myelinated whereas most axons of the monopolar type II cells remain
unmyelinated.

About half of the surviving ganglion cells (type II) are usually
unmyelinated, with an eccentric lobulated nucleus with dense
chromatin and a not very pronounced nucleolus. Their cytoplasm
contains only a few ribosomes but a great number of filaments
(fig. 4; Spoendlin 1971, 1972a). The other half of the surviving
ganglion cells (type III) very much resemble the ordinary
type I ganglion cell, with the only exception that they have no
myelin sheath (fig. 4, 5). Whereas type I and II are always found
in normal animals, the type III cells are only seen after secondary
degeneration has occurred, which indicates that they are derived
from type I cells, modified by, but still resisting, retrograde
degeneration (fig. 4) (Spoendlin 1973).

Secondary retrograde degeneration will also occur after
destruction of the organ of Corti by ototoxic antibiotics or
acoustic trauma, when the inner hair cells are destroyed.
This type of retrograde degeneration also affects mainly the
common type I myelinated ganglion cells, whereas the unmyelinated
type II and III tend to survive.

Fig. 5: Reconstruction of portions of the spiral ganglion in the
first and second turn of a cat after secondary degeneration
following transection of the cochlear nerve. The surviving ganglion
consists of about equal parts of type II and type III.

The type III ganglion cells are bipolar and their axons
become myelinated after a certain distance from the cell body
and can be followed through the osseous spiral lamina and into
the modiolus.




Most of the type II cells, however, are monopolar and their
axons seem to remain unmyelinated throughout their entire course.
These unmyelinated axons are of course rather difficult to
follow over long distances, but since we always find a certain
number of unmyelinated axons at every level of the osseous spiral
lamina in normal and operated animals, it is most likely that
the unmyelinated axons of the type II cells also reach the organ
of Corti. In normal animals we were never able to find
morphological evidence of synaptic contacts between the various
neurons within the spiral ganglion and osseous spiral lamina of the cat.

The remaining afferent nerves to the outer hair cells in
animals where retrograde degeneration has occurred after section
of the VIIIth nerve obviously belong to the surviving ganglion
cells (type II or III). In such animals, however, another type
of very large nerve fibre is also found at the level of the inner
hair cells at every 5th to 10th habenular opening. These fibres
expand in a spiral direction, connecting about 10 inner hair cells
and frequently presenting afferent synapses with the hair cells
(fig. 6, 8; Spoendlin 1971, 1972a). In normal control animals,
however, such giant fibres are not found even after careful
examination. They obviously appear only after secondary
degeneration has destroyed most cochlear neurons (type I) and has
induced some morphological changes in others (type III). Most
evidence indicates, however, that they are not just neurons in the
course of degeneration but belong to viable lasting neurons, since
we found them after all survival times from 4 to 13 months and
they exhibit frequent synapses with the inner hair cells. Whether
they are present in another, smaller form in normal animals
and what significance they have remains to be elucidated.

Fig. 6: Schematic representation of the types of nerve fibres
found in the habenular region several months after transection
of the VIIIth nerve. OF: afferent neurons to the outer hair cells,
GF: giant fibre associated with the inner hair cells, HF: fibre
ending in the habenula with vesicle filled enlargements,
RF: returning fibre.

Careful reconstructions of the habenular region in operated
animals provided evidence that the giant neurons of the inner
hair cell system come from myelinated fibres in the osseous
spiral lamina (fig. 6). Most afferent fibres of the outer
hair cells seem to originate in myelinated fibres as well, but
the difficulties in following their course backwards through the
habenula are considerable and the results not yet conclusive. In
addition to these fairly well defined remaining neurons, other
types were occasionally found. One of these is fairly large,
seems to end with tortuous varicose enlargements in the habenular
opening itself and is unmyelinated, whereas the other is rather
small, turns back before the habenula and might be myelinated or
unmyelinated. The significance of these last two fibre types
remains unclear, however, and for the time being we have to content
ourselves with mentioning their existence (fig. 6).

With the information obtained in these degeneration experiments
it is possible to recognize, also in normal animals, the
afferent fibres to the outer hair cells already within the
habenula, to distinguish them from the afferents of the inner hair
cell system and to trace their exact course. One or two of them
cross each habenular opening in a most distal position and take
a spiral basalward course over the distance of several inner
pillars before they penetrate between the inner pillars to form the
basilar fibres (fig. 7, 8). Within the habenula all fibres are
separated from each other by a special single satellite cell,
which surrounds all fibres. Immediately after the habenula the
fibres of one habenular fascicle lie in immediate contact with
each other, with no more than the usual intercellular gaps of
about 200 Å between the axon membranes of adjacent fibres (fig. 7).
These intimate contacts are especially extensive between the
fibres to the inner hair cells, whereas the fibres of the outer
hair cell system take a separate course soon after the habenula
and have only very short or no direct contacts with the fibres
of the inner hair cell system (fig. 7). Morphological evidence of
synapses between these fibres has never been found. Any direct
functional interaction between the fibres could therefore, if at
all, only occur on an electrical basis, predominantly between
fibres of the inner hair cell system. Whether the special
satellite cells in the habenula could possibly play an intermediate
role in functional interaction between the nerve fibres remains
an open question. How much the efferent system may mediate a
possible interaction between the inner and outer hair cell system
depends on type and site of the synaptic contacts between the
systems.

Fig. 7: Habenular region of a normal cat, where all nerve fibres
lose their individual Schwann cells (sw) and myelin sheath (M)
and penetrate the habenula (HA) as very small fibres, individually
surrounded by a common special satellite cell (S). As soon as
they have entered the organ of Corti they become thicker again
and run without any sheath as a closely packed bundle adjacent
to each other. Most of these fibres are afferent dendrites to
the inner hair cell (D). The afferent fibres to the outer hair
cells (B) take a separate course immediately after the habenula,
showing practically no direct contacts with the dendrites of the
inner hair cell system, and continue as basilar fibres through
the tunnel. The efferent tunnel radial fibres (R) emerge from the
same nerve bundle from the habenula and take a direct radial
course through the tunnel. The efferent inner spiral fibres (i)
synapse almost exclusively with afferent dendrites to the inner
hair cells (H).

Summarizing all these findings, the following conclusions are
allowed (fig. 8): The afferent and efferent nerve distribution on
inner and outer hair cells appears to be reciprocal. The afferent
dendrites exhibit an almost exclusively spiral distribution
on the outer hair cells and a clearly radial distribution on the
inner hair cells. The efferent terminal branches, however, present
a predominantly radial distribution with a relatively limited
spiral extension at the level of the outer hair cells and a
spiral distribution in the inner spiral plexus. The numerical
ratio of innervation density of inner and outer hair cells is
about 20 to 1. Each inner hair cell is innervated by about 20
independent afferent neurons, whereas at the level of the outer hair
cells each afferent neuron is connected to about 10 outer hair
cells. In other words, we are dealing with a highly divergent
innervation modus at the inner hair cell system and a highly
convergent innervation modus at the outer hair cell system. The
afferent neurons of the inner and outer hair cell system are not
only distinguished by their distribution pattern but also by
structural and metabolic differences and on the basis of their
entirely different degeneration behaviour. The afferents to inner
and outer hair cells appear to be two essentially separate
systems without morphological evidence of substantial interaction
within the organ of Corti, the osseous spiral lamina or the spiral
ganglion. The efferents of the outer and inner hair cell system
also differ clearly in their morphology, their synaptic
connections and their degeneration behaviour.




Fig. 8: Horizontal innervation schema of the organ of Corti with
the different types of afferent neurons on the left and of efferent
neurons on the right with their corresponding approximate
percentages.



The possible functional implications of these innervation
patterns have already been discussed on different occasions
(Spoendlin 1966, 69, 70, 71, 72, 73). The very pronounced
divergence of afferent innervation of the inner hair cells, where one
hair cell is connected to about 20 single neurons, and the great
convergence of the afferent innervation of the outer hair cells,
where about 10 hair cells are connected to the branches of one
neuron, indicate of course a basically different functional
behaviour of the two systems. Such a structural organization gives
the outer hair cell system the possibility of spatial summation,
which is not possible for single neurons of the inner hair cell
system. The direct consequence of this difference would be a
higher sensitivity of the outer hair cell system, which in fact has
been postulated by Stange et al. 1971 on the basis of their
electrophysiological results. The results obtained in
electro-cochleography would also fit such a concept (Aran 1971,
Portmann 1972) and would go along with the assumption that
recruitment is the consequence of a non-functioning outer hair
cell system.
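
The link between convergence and sensitivity can be illustrated with a deliberately simple toy calculation. The sketch below assumes, purely for illustration, that a neuron sums the inputs of its hair cells linearly and that all cells deliver equal, in-phase inputs; neither assumption is made in the text, and the function names are hypothetical.

```python
import math


def per_cell_input_at_threshold(neuron_threshold, n_hair_cells):
    """Per-cell input needed to reach threshold under linear, equal summation."""
    return neuron_threshold / n_hair_cells


def sensitivity_gain_db(n_hair_cells):
    """Threshold advantage, in dB, of summing n equal inputs over a single input."""
    return 20 * math.log10(n_hair_cells)


# One neuron fed by a single hair cell versus one neuron fed by 10 hair cells.
print(per_cell_input_at_threshold(1.0, 1))
print(per_cell_input_at_threshold(1.0, 10))
print(sensitivity_gain_db(10))
```

Under these idealized assumptions a neuron collecting from 10 cells reaches threshold with a tenth of the per-cell input; real receptor signals need not sum linearly or in phase, so the figure is an upper bound of a toy model rather than a prediction.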

In electrophysiological studies, however, Kiang (1965) was never
able to find two different types of cochlear neurons, one of
which would have a considerably lower threshold than the other.
In this connection an interesting study by Altmann (1972) should
be mentioned, who, on the basis of the known physiological
mechanisms of membranes, synapses and spike initiation and on the
assumption that the acoustic receptor consists only of a simple inner
hair cell system, elaborated a mathematical model of the coding
mechanism. By simulating in this mathematical system the same
experimental conditions as used by Kiang (1965) in his experiments
on the discharge pattern of the primary auditory neurons, he
obtained exactly the same spike histograms as found by Kiang. This
means that Kiang in his experiments most probably recorded
essentially from neurons associated with the inner hair cells and that
his results reflect mainly the coding mechanisms of the inner
hair cell system.

Other concepts of the outer hair cell function have been
brought forward, such as a monitoring action on the inner hair
cell system (Lynn and Sayers 1970). This, however, would necessitate
the possibility of interaction between the inner and outer hair
cell system, for which, as we have seen, there exists practically no
morphological substrate. It seems rather that the inner and outer
hair cell systems are anatomically two essentially separate
systems with only very limited possibilities of interaction of
their first order peripheral neurons. The first place where
substantial interactions between the two systems would anatomically
be conceivable are the cochlear nuclei, unless extensive electrical
interference between the different nerve fibres in the organ
of Corti is taking place, which is not very likely.

Although clear evidence for an inhibitory action of the
efferents on the afferent nerve activity exists (Desmedt, 1962;
Fex, 1962; Wiederhold et al. 1970), no satisfactory explanation
of the main function of the efferents in the cochlear receptor
has been found, despite a number of observations and postulations
on the functional role of the efferents, such as an effect on
adaptation (Leibbrandt 1965) and masking (Trahiotis et al. 1970),
stabilisation of threshold (Johnstone 1968) or prevention of
wastage of chemical transmitter in the outer hair cells (Davis 1968).
The basic difference of the efferents to the outer and inner hair
cells most probably also expresses a different functional
significance. The enormous efferent nerve supply to the outer
hair cells, apparently overdimensioned in comparison to the
afferent nerve supply, would go along with a concept of a more
monitoring role of the outer hair cell system. Whatever the role
of the efferents might be, it is bound to be important in view of
the enormous representation of efferent fibres and endings in the
organ of Corti.

As Békésy has already indicated in his pioneer work on
cochlear function, there is no question that the complex but
apparently very definite innervation pattern of the cochlear
receptor with different types of neurons is a most important
factor in the processing of acoustic information in the auditory
system.





REFERENCES

Altmann, A. (1972): Modellierung von Nervenfunktionen bei
spezieller Anwendung auf den primären Hörnerv. Thesis ETH,
(Nr. 5032).

Aran, J.-M. (1972): L'Electro-Cochléogramme. Les Cahiers de la
C.F.A. 14: 101-128.

Békésy, G. von (1960): Experiments in Hearing (McGraw-Hill,
New York).

Davis, H. (1968): Contribution to discussion, in: Hearing
Mechanisms in Vertebrates, A.V.S. de Reuck & J. Knight, Eds.
(Churchill, London) pp. 119, 305.

Desmedt, J.E. (1962): Auditory-evoked potentials from cochlea to
cortex as influenced by activation of the efferent
olivo-cochlear bundle. J. Acoust. Soc. Amer. 34: 1478-1496.

Engström, H. and Wersäll, J. (1958): The ultrastructural
organization of the organ of Corti. Exp. Cell. Res. Suppl. 5.

Fex, J. (1962): Auditory activity in centrifugal and centripetal
cochlear fibres in cat. Acta Physiol. Scand. 55: Suppl. 189,
5-68.

Iurato, S. (1962): Efferent fibres to the sensory cells of Corti's
organ. Exp. Cell. Res. 27: 162.

Kimura, R. and Wersäll, J. (1962): Termination of the
olivo-cochlear bundle in relation to the outer hair cells of the
organ of Corti in guinea pig. Acta Oto-Laryngol. 55: 11-32.

Leibbrandt, C.C. (1965): The significance of the olivo-cochlear
bundle for the adaptation mechanism of the inner ear. Acta
Oto-Laryngol. 59: 124-132.

Lynn, P.A., Sayers, B.McA. (1970): Cochlear innervation, signal
processing and their relation to auditory time intensity
effects. J. Acoust. Soc. Amer. 47: 525-532.

Portmann, M. (1972): Discussion to H. Spoendlin, in: Innervation
densities of the cochlea. Acta Oto-Rhino-Laryng. 73: 235-248.

Rasmussen, G.L. (1960): Efferent fibres of the cochlear nerve and
cochlear nucleus, in: Neural Mechanisms of the Auditory and
Vestibular Systems, G.L. Rasmussen and W.F. Windle, Eds.
(Charles C. Thomas, Springfield, Ill.), pp. 105-115.

Smith, C.A. and Rasmussen, G.L. (1963): Recent observations on the
olivo-cochlear bundle. Ann. Otol. Rhinol. Laryngol. 72: 489.

Smith, C.A. (1972): Preliminary observations on the terminal
ramifications of nerve fibres in the cochlea. Acta Oto-Laryng.

Spoendlin, H. and Gacek, R.R. (1963): Electronmicroscopic study of
the efferent and afferent innervation of the organ of Corti
in the cat. Ann. Otol. Rhinol. Laryng. 72: No. 3, 1-27.

Spoendlin, H. and Gacek, R.R. (1965): Survival of the peripheral
dendrites after section of the cochlear nerve, in:
Proceedings of the Vth Int. Congr. of Neuropathology. Reprinted
from Excerpta Medica Int. Congr. Ser. No. 100, 926-934.

Spoendlin, H. and Lichtensteiger, W. (1966a): The adrenergic
innervation of the labyrinth. Acta Oto-Laryng. 61: 423-434.

Spoendlin, H. (1966b): The Organization of the Cochlear
Receptor. (Karger, Basel-New York).

Spoendlin, H. (1967b): The innervation of the organ of Corti.
J. Laryngol. Otol. 81: 717-738.

Spoendlin, H. (1969a): Innervation patterns in the organ of Corti
of the cat. Acta Oto-Laryng. 67: 239-254.

Spoendlin, H. (1969b): Struct. basis of periph. frequency
analysis, in: Frequency analysis & periodicity detection in
hearing. Eds. R. Plomp & G.F. Smoorenburg, Sijthoff, Leiden 1970.

Spoendlin, H. (1971): Degeneration behaviour of the cochlear
nerve. Arch. klin. exp. Ohr.-, Nas.- u. Kehlk.Heilk. 200: 275-291.

Spoendlin, H. (1972a): Innervation densities of the cochlea.
Acta Otolaryng. 73: 235-248.

Spoendlin, H. (1972b): Autonomic nerve supply to the inner ear,
in: "Vascular disorders and hearing defects", de Lorenzo,
Ed. (Univ. Park Press, Baltimore).

Spoendlin, H., Brun, J.P. (1972c): Relation of structural damage
to exposure time and intensity in acoustic trauma. Acta
Oto-laryng. 75: 220-226.

Spoendlin, H. (1973): The innervation of the cochlear receptor.
Proceeding of a Sympos. on: Basic mechanisms in hearing.
Academic Press 1973, pp. 185-234.

Stange, G., Holz, E., Terayama, Y. and Beck, Chl. (1966):
Korrelation morphologischer, biochemischer und
elektrophysiologischer Untersuchungsergebnisse des akustischen
Systems. Arch. klin. exper. Ohr-, Nas.- u. Kehlkopfheilk. 186: 229-246.

Trahiotis, C. and Elliot, D.N. (1970): Behavioral investigation
of some possible effects of sectioning the crossed
olivo-cochlear bundle. J. Acoust. Soc. Amer. 47: pp. 592-596.

Wiederhold, M.L. and Kiang, N.Y.S. (1970): Effects of electrical
stimulation of the crossed olivo-cochlear bundle on single
auditory-nerve fibres in the cat. J. Acoust. Soc. Amer. 48/4:
(II) 950-965.







COMMENTS ON: "Neuroanatomy of the cochlea" (H. SPOENDLIN)

R.R. PFEIFFER, C.E. MOLNAR, AND J.R. COX, JR.

Dept. of Electrical Engineering, Washington University, Saint Louis, Mo.

The difference between the percentages of innervation by afferent cochlear
nerve fibers to the inner and outer hair cells, as described by Spoendlin,
has certainly provided a major impetus for identification of
electrophysiological correlates. While it is true that Kiang (1965), in his
study of cochlear nerve discharge patterns and properties, did not identify
two different types of cochlear neurons, subsequent reports have (e.g.
Pfeiffer and Kim, 1972a, 1972b).

Our electrophysiological identification of populations is based solely on 
response patterns to click stimuli. The populations are identified by both 
qualitative characteristics of the response patterns and an unequivocal 
quantitative difference. 

Figure 1 shows how compound PST histograms of responses to click stimuli are 
composed for analysis. Figure 2 shows four histograms typical of the res- 
ponse pattern of population I fibers. Figure 3 shows histograms for two 
population II fibers. 

The qualitative difference lies in the shape of the envelope of response;
population II fibers characteristically have a distinctly modulated
envelope. The quantitative difference lies in the total number of peaks in
the compound PST histogram as a function of changes in the click-stimulus
signal level (Figure 4).

We have found in a sample of 907 fibers that: a) 93% are population I and
7% population II (Spoendlin reports that 95% of the afferent cochlear nerve
fibers innervate inner hair cells and 5% outer hair cells); b) population II
fibers appear to be more sensitive than population I fibers because of the
extended duration of response (fibers innervating outer hair cells could
be more sensitive because of the possibilities of spatial summation of
receptor signals); and c) population II fibers have response patterns
consistent with summing small numbers of decaying sinusoidal inputs, Figure 5
(each afferent fiber to outer hair cells receives inputs from several outer
hair cells). Details of these observations and calculations can be found in
Pfeiffer and Kim, 1972b.

Whether or not the anatomical observations and the electrophysiological 
observations pertain to the same population division remains to be seen. 








Figure 1. Poststimulus time (PST) histograms of responses to click stimuli.
(a) PST histogram of responses to rarefaction clicks. (b) PST histogram
of responses to condensation clicks for the same nerve fiber and for the
same stimulus amplitude as for Part (a). (c) A combination of Parts (a)
and (b) (compound PST histogram), except the histogram for condensation
clicks (b) has been inverted and put in time registration with the
histogram for rarefaction clicks (a). (From Goblick and Pfeiffer, 1969.)
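
The construction described in this caption can be sketched in a few lines of Python. This is only an illustration of the idea, not the authors' analysis software: the function names, the bin width and the spike times are hypothetical, and spike times are assumed to be measured in milliseconds from click onset.

```python
def pst_histogram(spike_times_ms, bin_width_ms, n_bins):
    """Count spikes per poststimulus time bin."""
    counts = [0] * n_bins
    for t in spike_times_ms:
        index = int(t // bin_width_ms)
        if 0 <= index < n_bins:      # ignore spikes outside the window
            counts[index] += 1
    return counts


def compound_pst_histogram(rarefaction_ms, condensation_ms, bin_width_ms, n_bins):
    """Pair each bin's rarefaction count (plotted upward) with the
    inverted condensation count (plotted downward) on one time axis."""
    up = pst_histogram(rarefaction_ms, bin_width_ms, n_bins)
    down = pst_histogram(condensation_ms, bin_width_ms, n_bins)
    return [(u, -d) for u, d in zip(up, down)]


# Hypothetical spike times pooled over many click presentations.
rarefaction = [0.5, 1.2, 1.4]
condensation = [0.9, 2.1]
print(compound_pst_histogram(rarefaction, condensation, 1.0, 3))
```

Keeping the two polarities as a signed pair per bin, instead of subtracting them, mirrors the display described in the caption, where both histograms remain visible on a common time base.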







Figure 2. Compound PST histograms for four different fibers from the same 
animal. Unit numbers are given to the left of each unit, the characteristic 
frequencies are given below each unit. The time scale is the same for each 
histogram. All histograms are in response to approximately 500 stimuli of 
each polarity. (From Pfeiffer and Kim, 1972b.) 








Figure 3. Two examples of compound PST histograms from Population II 
fibers. The characteristic frequencies of the fibers are approximately 
(from left to right) 775 and 2000 respectively. 



[Figure 4 axes: change in number of peaks in histogram (vertical) versus
change in signal level, dB (horizontal, 0 to 50).]



Figure 4. Plot of the change in the number of peaks in the compound PST
histogram versus change in the signal level for twelve different fibers,
six from Population I (dashed lines) and six from Population II (solid
lines). (See Pfeiffer and Kim, 1972b, for details.)







Figure 5. Two waveforms constructed by summing exponentially damped 
sinusoids that cover a narrow band of frequencies. The number of added 
waveforms is 5. These waveforms are shown to illustrate a variety of 
envelopes that can be obtained by adding small numbers of waveforms. 

These examples are first-order approximations to the histograms in 
Figure 4. Their horizontal time scales have not been adjusted. The 
normalized bandwidths of the oscillatory frequencies that make up the wave- 
forms shown, are approximately 0.33 and 0.16 respectively. First-order 
calculations based on data of Schuknecht (1960) state that for 
frequencies corresponding to the CF shown in Figure 4, the bandwidths 
suggested by the simulations would correspond to innervating from about 
0.3 to 1.1 mm of the basilar membrane under the assumption that each wave- 
form is derived from a single sensory cell. (See Pfeiffer and Kim, 1972b.) 
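
The construction described in the caption of Figure 5 can be sketched in a few lines. This is only an illustration under stated assumptions: the five components are given equal weight, zero starting phase and a common decay constant, their frequencies are spread evenly over the band, and "normalized bandwidth" is taken to mean band width divided by center frequency. The center frequency (1000 Hz), decay constant and sampling rate are arbitrary choices, not values from Pfeiffer and Kim.

```python
import math


def summed_damped_sinusoids(center_hz, normalized_bandwidth, n_components=5,
                            decay_s=0.002, duration_s=0.02,
                            sample_rate_hz=50_000):
    """Sum n equally weighted, exponentially damped sinusoids.

    Component frequencies are spread evenly over
    center_hz * (1 +/- normalized_bandwidth / 2). Equal weights, zero
    starting phase and a shared decay constant are assumptions.
    """
    half_width = 0.5 * normalized_bandwidth * center_hz
    if n_components == 1:
        freqs = [center_hz]
    else:
        step = 2.0 * half_width / (n_components - 1)
        freqs = [center_hz - half_width + k * step for k in range(n_components)]
    n_samples = round(duration_s * sample_rate_hz)
    times = [i / sample_rate_hz for i in range(n_samples)]
    # One shared exponential envelope multiplies the sum of the sinusoids.
    wave = [
        math.exp(-t / decay_s) * sum(math.sin(2.0 * math.pi * f * t) for f in freqs)
        for t in times
    ]
    return times, wave


# The two normalized bandwidths quoted in the caption; 1000 Hz is hypothetical.
times, wide = summed_damped_sinusoids(1000.0, 0.33)
_, narrow = summed_damped_sinusoids(1000.0, 0.16)
```

Plotting `wide` and `narrow` against `times` shows how the beat pattern of the envelope changes with bandwidth even though the number of components is fixed.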

REFERENCES 

Kiang, N.Y.S., with the assistance of Watanabe, T., Thomas, E.C., and Clark,
L.F. (1965). "Discharge Patterns of Single Fibers in the Cat's Auditory
Nerve," MIT Res. Monogr. 35.

Pfeiffer, R.R. and Kim, D.O. (1972). "Anomalies of Response Patterns of Single
Cochlear Nerve Fibers to Click Stimuli," J. Acoust. Soc. Amer. 55, 93(A).

Pfeiffer, R.R. and Kim, D.O. (1972). "Response Patterns of Single Nerve
Fibers to Click Stimuli: Descriptions for Cat," J. Acoust. Soc. Amer.
1669 (B).

ADDITIONAL REMARKS 

TONNDORF: Dr. Kohllöffel showed us yesterday unusual forms of spiral
ganglion responses, a finding that suggested, at least to me, a possible
correspondence to your anatomical observations. My question is whether
there are any indications of synaptic (or ephaptic) connections between
your two sets of fibers beyond the point of Kohllöffel's recordings, i.e.
below the habenula.

SPOENDLIN: I looked carefully in many animals and have never found
any evidence for such connections.

EVANS: I am not so pessimistic as to think that electrical interaction
could not occur within the habenula between outer spiral fibres and inner
radial fibres. We cannot dismiss the possibility that, via mediation of the
habenular satellite cell, spikes propagated down the outer spiral fibres
could initiate or influence spike generation in the inner radial fibres.
Such electrical interaction has been demonstrated in spinal cord (P.G.
Nelson), and, I understand, in the eel lateral line system (E. Alneas,
1973, Acta Physiol. Scand., 87, 88).







NEUROBIOLOGY OF HAIR CELLS AND THEIR SYNAPSES
Å. FLOCK

King Gustaf V Research Institute, S-104 01 Stockholm 60, Sweden

In order to understand sensory processes in the organ of Corti basic 
knowledge is required about the physiology of excitation and inhibi- 
tion in hair cells in general. Such information must be obtained 
from hair cells and their innervating nerve fibres by a number of 
research methods appropriate to the questions asked. There is reason 
to believe that hair cells in the cochlea, in the vestibular system 
and in lateral line organs work according to the same basic prin- 
ciples, with the possible exception of synaptic transmission in 
mammalian vestibular Type I hair cells. Therefore the choice of
preparation can be suited to the question asked and to the research
methods that will be applied.

The results described briefly below have been obtained from hair 
cells and synapses in lateral line canal organs in fish, in epidermal 
lateral line organs of salamander, in the basilar papilla of the 
bullfrog and in the crista ampullaris of the semicircular canal of 
the skate. The results will be arranged below under subject headings 
relevant to sensory processing in the organ of Corti. 

Electrical properties of hair cells 

Intracellular recordings have shown that hair cells in lateral line
canal organs had membrane potentials of 10-65 mV, cell input resistances
of 10-100 MΩ, a specific membrane resistance of 100-1000 Ω·cm², a time
constant of <200 µsec and a specific membrane capacitance of about
0.5 µF/cm² (Flock, Jorgensen and Russell, 1973a). Due to technical
difficulties these values must be regarded as approximate.





However, when the efferent synapses were activated there was a change 
in membrane potential and resistance which showed that the measured 
values were largely due to viable biological membranes. Hair cells and 
supporting cells were identified by dye marking techniques. These 
values put hair cells in the same category as neurons and muscle 
fibres and mean that hair cells are electrically isolated units not 
coupled together by low resistance junctions. 
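
As a rough cross-check of the passive values quoted above, the membrane time constant can be computed as the product of specific resistance and specific capacitance. The sketch below assumes that the specific membrane resistance is expressed in Ω·cm² and the capacitance in µF/cm², so that the product comes out directly in microseconds; the function name is hypothetical.

```python
def membrane_time_constant_us(resistance_ohm_cm2, capacitance_uf_per_cm2):
    """tau = R_m * C_m; ohm*cm^2 times uF/cm^2 gives microseconds."""
    return resistance_ohm_cm2 * capacitance_uf_per_cm2


# Ends of the quoted resistance range, with the quoted capacitance of 0.5 uF/cm^2.
tau_low_us = membrane_time_constant_us(100.0, 0.5)    # 50 microseconds
tau_high_us = membrane_time_constant_us(1000.0, 0.5)  # 500 microseconds
```

Under these assumptions only the lower part of the resistance range (up to about 400 Ω·cm²) gives a time constant below 200 µsec, which is in keeping with the statement that the values are approximate.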

When excited by a low frequency tone hair cells generated a
receptor potential with an amplitude which seldom exceeded 1 mV.
When cells were electrically depolarized or hyperpolarized by as much
as ±15 mV by current injection in a bridge configuration, no resistance
change could be detected. This means that even though hair cells
can be passively depolarized or hyperpolarized by current passed
across the reticular lamina, they are not electrically excitable in
the same sense as neurons and muscle fibres are: the latter change their
resistance in response to a voltage drop across the membrane, thus
giving rise to nerve or muscle impulses. Impulses were never seen in
hair cells.

Hair cells are thus individual current generators which are 
mechanically but not electrically excitable. Signal transmission 
through them is not active but passive with a high fidelity. 

Synaptic transmission 

Evidence is now accumulating that synaptic transmission at afferent
as well as efferent synapses is chemical; excitatory postsynaptic
potentials have been recorded in afferent nerve fibres and terminals
(Furukawa and Ishii, 1967; Flock and Russell, 1973a); inhibitory
postsynaptic potentials have been recorded in hair cells when
efferent nerve fibres are stimulated (Flock and Russell, 1973a).

Efferent nerve endings are filled with synaptic vesicles and
such vesicles are also present at the afferent synapse in hair cells
where they surround an osmiophilic structure, the synaptic body.
Furukawa and Ishii (1967) showed that postsynaptic potentials were graded
in a quantal fashion, suggesting that synaptic vesicles contain
neurotransmitter released in quantal amounts to give rise to miniature
potentials which can be summed in space and time. In recordings from
afferent terminals within the sensory epithelium of lateral line canal
organs individual miniature potentials were seen to have different
amplitudes (Flock et al., 1973b). This implies that there are multiple
release sites situated at different electrotonic distances from the
recording point, and that spread in the non-myelinated portion of the
nerve fibre is decremental. However, this may or may not be the case
in the cochlea; outer spiral fibres are extremely thin and long and
may well carry nerve action potentials (Easton, 1965).

The inhibitory postsynaptic potential in hair cells had a protracted
time course; it outlasted the firing of efferent fibres by
150-200 msec (Flock and Russell, 1973a, b). Efferent inhibition can
therefore not be used in a time-resolving capacity.

The inhibitory postsynaptic potential is paralleled by a reduction
in the amplitude of excitatory postsynaptic potentials in afferent
terminals and by a reduced probability of nerve impulse initiation.
Both these effects are blocked by curare or flaxedil, which
do not, however, affect the height of afferent excitatory postsynaptic
potentials. This indicates that the afferent and efferent
synapses use different transmitters.




Neurotransmitter identity

Neurons which use a certain transmitter have the characteristic abili- 
ty to synthesize this transmitter and store it in high concentration. 
It is possible to screen for synthesis of several transmitters by 
supplying isolated neural tissue with radioactively labelled pre- 
cursors for known transmitters and to determine which transmitters 
have been synthesized by radiochemical methods. This technique has 
been applied to inner ear and lateral line sense organs, looking for 
synthesis of noradrenaline, dopamine, γ-aminobutyric acid, acetylcholine
and serotonin (Flock and Lam, 1974). It was found that synthesis
of acetylcholine coincided with presence of an efferent innervation
(lateral line canal organ and skate crista ampullaris). All
organs tested, including also the frog basilar papilla which receives
only an afferent innervation, showed synthesis of γ-aminobutyric acid.
This was matched by demonstrating the presence of the synthesizing
enzymes (choline acetyltransferase and glutamic acid decarboxylase) in
the appropriate places. Furthermore, it was found that afferent nerve
firing could be blocked by low concentrations of picrotoxin, a specific
blocking agent for synapses using γ-aminobutyric acid as transmitter.

This confirms previous data implicating acetylcholine as the
efferent transmitter and suggests the possibility that γ-aminobutyric
acid is the transmitter used by hair cells to excite sensory nerve
fibres.

Long term changes at synapses 

The synaptic body at the afferent synapse in hair cells has been shown
to increase in size in response to acoustic stimulation (Frishkopf,
1973). Also, these bodies change their number and position when
observed in vivo in the tail of salamander tadpoles (Flock et al.,
1973b). This implies that the synapse is perhaps capable of adjusting
to demand by increasing or decreasing its capacity as needed, and
points to the fact that hair cells are living things.



ACKNOWLEDGEMENT

This work has been supported by grants from the Swedish Medical
Research Council (04X-2461), King Gustaf V Memorial Fund and the
Grass Foundation.

REFERENCES 

Easton, D. (1965). "Impulses at the artifactual nerve end." Cold
Spring Harbor Symp. Exp. Biol. vol. XXX, pp. 15-28.

Flock, Å. and Lam, D. (1974). "Biosynthesis of neurotransmitters in
the inner ear and lateral line." Nature. Submitted for publication.

Flock, Å. and Russell, I. (1973a). "Postsynaptic action of efferent
fibres on hair cells." Nature, N.B. 243, 89-91.

Flock, Å. and Russell, I. (1973b). "The postsynaptic action of
efferent fibres in the lateral line organ of the burbot Lota lota."
J. Physiol. 235, 591-605.

Flock, Å., Jorgensen, J. and Russell, I. (1973a). "Passive electrical
properties of hair cells and supporting cells in the lateral line
canal organ." Acta Otolaryng (Stockh.) 190-198.

Flock, Å., Jorgensen, J. and Russell, I. (1973b). "The physiology of
individual hair cells and their synapses." In: Basic Mechanisms in
Hearing. Ed. A. Miller. Academic Press, New York, pp. 273-506.

Frishkopf, L. (1974). "Effects of stimulation on synaptic body size
in auditory hair cells in the Frog." Brain, Behavior and Evolution.
In preparation.

Furukawa, T. and Ishii, Y. (1967). "Neurophysiological studies on
hearing in goldfish." J. Neurophysiol. 30, 1377-1403.



ADDITIONAL REMARKS 

TONNDORF: Is there a synaptic bar in any other neural structure? 

FLOCK: Yes, there is. Presynaptic structures similar to those in
the inner ear cells are also present in photoreceptor cells and in
electroreceptors.

SCHWARTZKOPFF: The current you have shown to pass through the hair
cells is modulated by the shearing of the stereo-cilia. The 
depolarizing phase indicates an increasing current and excitation. 
Would you agree to call the opposite phase - decreasing current - 
inhibitory? 

FLOCK: Yes, I would agree to this. A second type of inhibition would 
be through the efferents. 




Cochlear Mechanisms



MEASUREMENTS OF SOUND PRESSURE IN THE COCHLEAE OF ANESTHETIZED 
CATS 

V. NEDZELNITSKY 

Research Laboratory of Electronics, Massachusetts Institute of Technology, 
Cambridge, and Eaton-Peabody Laboratory of Auditory Physiology, Massachusetts 
Eye and Ear Infirmary, Boston, Mass., U.S.A.

INTRODUCTION: It is generally presumed that motion of the ossicles in the
mammalian middle ear produces sound pressure in the cochlear fluids, and that
differences in sound pressure across the cochlear partition result in its motion.

Measurements of motion of the cochlear partition have established that this
motion is frequency selective (von Békésy, 1960; Johnstone and Boyle, 1967;
Rhode, 1971; Kohllöffel, 1972; Wilson and Johnstone, 1972). The physical basis
for this frequency selectivity has been assumed to be the variation of the mechanical
properties of the partition as a function of position along the cochlear spiral.
However, only static mechanical properties of the partition have been measured
(von Békésy, 1960).

In order to determine the dynamic mechanical properties of the cochlear
partition, it is necessary to measure both the pressure differences acting on this
partition and its motion. Systematic measurements of intracochlear sound pressure
at audio frequencies have not been made previously, although an earlier study
(Burgeat et al., 1963) indicated that such measurements should be possible. The
present paper reports sound-pressure measurements in the perilymphatic scalae of
the basal turn in anesthetized cats.

METHODS: The pressure measurements were made with a probe microphone (Fig. 1)
consisting of a fluid-filled probe tube leading to a small (volume < .005 cm³)
fluid-filled cavity terminated by the diaphragm of a piezoelectric pressure
transducer. Probe microphones were calibrated over the frequency range of
measurement before and after each experiment (Schloss and Strasberg, 1962).
For all experiments reported here, the two calibration curves agreed within
5 dB. Technical limitations restricted the frequency range of pressure
measurements to 15-12 000 Hz. At frequencies between 1 000 and 3 000 Hz, it
was possible to measure sound pressure in scala vestibuli for stimulus levels
of 40 to 105 dB SPL at the tympanic membrane and in scala tympani for 75 to
105 dB SPL.



Fig. 1: Schematic diagram of the pressure probe microphone (not to scale).



Measurements were performed with the bulla opened widely and the bony
septum removed. A closed acoustic system (Kiang et al., 1965) was used to generate
and to measure stimulus sound pressure near the tympanic membrane. Holes
(approximately 0.4 mm in diameter) were drilled into scala tympani and scala
vestibuli. Probe microphone tips were inserted into the holes and sealed into the
cochlea.

On the basis of measurements with the probe tip plugged and sealed into
place, it was concluded that the unplugged probe indeed responded to sound
pressure at its tip. From measurements of cochlear potentials in response to sound,
it was surmised that the experimental procedures did not have large effects on
cochlear pressures. Sound pressures measured before and after interrupting the
incudo-stapedial joint showed that the measured pressures were produced primarily
by stapes motion conducted via the ossicular chain. In general, measurements of
sound pressure were stable throughout experiments lasting many hours.

RESULTS: Intracochlear sound pressures were measured in 29 cats. Figure 2 shows
results from a cat in which sound pressures in scala vestibuli and scala tympani of
the same cochlea were measured as functions of frequency for a constant sound



Fig. 2: Magnitude and angle of sound pressures in the basal turn of a cat cochlea.
Shown are the pressure in scala vestibuli (solid lines), the pressure in scala tympani
(solid lines with circles), and pressure difference across the cochlear partition (dotted
lines). The pressure difference (sound pressure in scala vestibuli minus sound pressure
in scala tympani) deviates from the pressure in scala vestibuli only at low frequencies.
Here and in Fig. 3, curves are based on data points evenly spaced on the logarithmic
frequency scale, with 12 points per octave. A positive angle denotes phase lead of
intracochlear sound pressure relative to sound pressure at the tympanic membrane.





pressure level at the tympanic membrane. As frequency increases from 30 Hz, the 
magnitude of sound pressure in scala vestibuli increases to a maximum near 1 000 Hz. 
The value of this maximum exceeds sound pressure at the tympanic membrane (by 
about 30 dB), indicating that at these frequencies the middle ear provides pressure 
gain. Above 1 000 Hz, the magnitude of sound pressure in scala vestibuli tends to 
decrease with increasing frequency. In scala tympani for frequencies below 400 to 
500 Hz, the sound pressure is constant in magnitude (± 5 dB) and angle (± 35 degrees) 
and is in phase (+ 40, -30 degrees) with the sound pressure at the tympanic 
membrane. At frequencies above 500 Hz, sound pressure in scala tympani tends to 
increase in magnitude, and its angle is no longer independent of frequency. 

As frequency decreases below 40 Hz, the sound pressures in scala vestibuli 
and scala tympani become equal in magnitude (within 6 dB). Preliminary results 
indicate that the angle of sound pressure in scala vestibuli approaches the angle of 
sound pressure in scala tympani at these low frequencies. 

Over the frequency range 100-5 000 Hz, the sound-pressure magnitude in
scala vestibuli exceeds that in scala tympani by at least 15 dB. It follows that the
pressure difference across the cochlear partition is equal (within 1 to 2 dB) to the
sound pressure in scala vestibuli for these frequencies (Fig. 2).
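
The "within 1 to 2 dB" figure follows from simple phasor arithmetic, which can be checked with a short sketch. It assumes only that the two pressures are sinusoids of the same frequency with an arbitrary relative phase; the function name is hypothetical. If the smaller pressure is 15 dB below the larger, the magnitude of their difference is bounded by the in-phase and antiphase cases.

```python
import math


def difference_bounds_db(excess_db):
    """Bounds, in dB re the larger pressure, on |p_sv - p_st|.

    excess_db is how far |p_sv| exceeds |p_st| (must be > 0). The bounds
    hold for any relative phase between the two pressures.
    """
    ratio = 10.0 ** (-excess_db / 20.0)      # |p_st| / |p_sv|
    lower = 20.0 * math.log10(1.0 - ratio)   # pressures in phase
    upper = 20.0 * math.log10(1.0 + ratio)   # pressures in antiphase
    return lower, upper


lower_db, upper_db = difference_bounds_db(15.0)
```

For a 15 dB excess the bounds are roughly -1.7 dB and +1.4 dB, and they tighten as the excess grows, so the pressure difference stays within about 2 dB of the scala vestibuli pressure.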

For single tones, sound pressures in scala vestibuli and scala tympani are
sinusoidal and linearly related to stimulus sound pressure at the tympanic membrane
(at least in the sense that a 15 dB ± 1 dB change in stimulus amplitude results in a
15 dB ± 1 dB change in response amplitude). This appears to hold throughout the
frequency range 20-12 000 Hz for stimulus levels up to at least 105 dB SPL.
Consequently, the relation between sound pressure at the tympanic membrane and
sound-pressure difference across the cochlear partition may be expressed as a
transfer function (Fig. 3).
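
A minimal sketch of how such a transfer function could be evaluated at one frequency, assuming each pressure is represented as a complex phasor. The function name and the numerical phasors are hypothetical and are not measured values; a positive angle denotes phase lead relative to the pressure at the tympanic membrane, as in Fig. 2.

```python
import cmath
import math


def transfer_function(p_sv, p_st, p_tm):
    """Return (magnitude in dB, angle in degrees) of (p_sv - p_st) / p_tm."""
    h = (p_sv - p_st) / p_tm
    return 20.0 * math.log10(abs(h)), math.degrees(cmath.phase(h))


# Hypothetical phasors in arbitrary units, for illustration only.
p_tm = cmath.rect(1.0, 0.0)
p_sv = cmath.rect(10.0, math.radians(45.0))
p_st = cmath.rect(1.0, 0.0)
gain_db, angle_deg = transfer_function(p_sv, p_st, p_tm)
```

Because the system is linear, scaling all three phasors by the same factor leaves the result unchanged, which is what allows a single curve to describe the response at all stimulus levels in the linear range.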

DISCUSSION: The sound pressure in scala tympani at frequencies from 15 to 400 or
500 Hz can be given a simple physical interpretation. At these frequencies, volume 
displaced by the stapes footplate in cat is proportional to sound pressure at the 
tympanic membrane (Guinan and Peake, 1967). If one assumes the volume displace- 
ment of the round-window membrane to be equal to that of the stapes footplate (with 




Fig. 3: Magnitude and angle of the transfer function: sound pressure across the
cochlear partition/sound pressure at the tympanic membrane. Measurements from
six cats are shown. In four cats, probe tips were simultaneously situated in each
scala, and measurements in both scalae were performed in approximately 15 minutes.
In two cats, a single probe was used to measure pressure first in scala tympani, then,
several hours later, in scala vestibuli.





the round-window membrane bulging outward as the stapes footplate is displaced
into the cochlea), then the sound pressure in scala tympani at frequencies below 
400 to 500 Hz would be proportional to and in phase with the volume displacement 
of the round-window membrane. For these frequencies, the behavior of the round- 
window membrane would be predominantly that of an acoustic compliance. 

At frequencies below 30 Hz, the sound pressures in scala vestibuli and scala 
tympani are experimentally shown to be within 6 dB in magnitude, and appear to be 
in phase with sound pressure at the tympanic membrane. Thus at these very low 
frequencies, the sound pressures in both scalae behave as if they are determined by 
the compliance of the round-window membrane. At higher frequencies, this 
membrane is no longer the most significant factor, and a pressure difference can be 
measured across the cochlear partition. 

The frequency dependence of this pressure difference can be compared with
the behavior of auditory-nerve fibers that innervate the region of the cochlea in
which pressures were measured. The probe tip positions along the cochlea
correspond to frequencies between 10 and 15 kHz on Schuknecht's (1960) map
relating frequency to distance along the basilar membrane. Auditory-nerve fibers
with characteristic frequency (CF) falling roughly between 10 and 15 kHz presumably
innervate this part of the cat cochlea (Kiang, Moxon, and Levine, 1970).

If the magnitude curves of Fig. 3 are inverted to show the ratio of sound
pressure at the tympanic membrane to sound-pressure difference across the cochlear
partition, the shapes of the resulting curves can be compared with the shapes of the
low-frequency portions of "tuning curves" for auditory-nerve fibers with high CF
(Fig. 4). For the stimulus levels of Fig. 4 and at frequencies below 2 to 3 kHz,
constant discharge rate of these fibers apparently corresponds to constant-amplitude
sound-pressure difference across the cochlear partition.

Relating nerve-fiber activity to motion of the cochlear partition requires
presently unavailable knowledge of the dynamic mechanical properties of the
partition. For example, if these properties are dominated by locally determined
stiffness, then the displacement of a localized region of the partition is proportional
to the pressure difference across it. A constant discharge rate would then correspond
to constant displacement amplitude. If, on the other hand, these dynamic properties




Fig. 4: Comparison of intracochlear sound pressure data with tuning curves of
auditory-nerve fibers. Tuning curves of four auditory-nerve fibers (data from Kiang
and Moxon, 1974) of different characteristic frequency (CF) are plotted with a
sliding vertical axis so that the dashed lines give the 90 dB SPL reference for each
curve. A median curve of pressure difference across the cochlear partition was
computed from the data of Fig. 3 and used to show the sound pressure at the tympanic
membrane required to maintain the sound pressure difference across the cochlear
partition constant in amplitude as frequency is varied. For purposes of comparison,
the vertical position of each of the four such median curves plotted above was
adjusted to a different absolute level in order to intersect a given tuning curve near
1 kHz.





are dominated by locally determined resistance (frictional effects), the velocity of
a localized region of the partition is proportional to the pressure difference across
it, and a constant discharge rate would correspond to constant velocity amplitude.
It may also be that some combination of these or other characteristics would provide
a more accurate formulation of the relationship between cochlear mechanics and
neural responses. It is possible (but by no means certain) that if data on both motion
of the cochlear partition and pressure difference across it were available, simple
concepts might suffice to describe the mechanical properties of the partition and
their relationship to auditory-nerve fiber responses, at least for frequencies well
below CF.
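
The two limiting cases discussed above can be written compactly. The block below is an illustrative sketch, not a formulation taken from the paper: it assumes a local point-impedance description of the partition, with stiffness k, resistance r and mass m per unit area (the mass term is added only for completeness), pressure difference ΔP, displacement X and velocity V.

```latex
\Delta P(\omega) = \left[\, r + j\left(\omega m - \frac{k}{\omega}\right) \right] V(\omega),
\qquad V(\omega) = j\omega\, X(\omega)

\text{stiffness-dominated:}\quad
\Delta P \approx \frac{k}{j\omega}\, V = k\, X
\;\;\Rightarrow\;\; |X| = \frac{|\Delta P|}{k}

\text{resistance-dominated:}\quad
\Delta P \approx r\, V
\;\;\Rightarrow\;\; |V| = \frac{|\Delta P|}{r}
```

In the first case a constant pressure-difference amplitude implies constant displacement amplitude; in the second it implies constant velocity amplitude, that is, a displacement amplitude that falls in inverse proportion to frequency.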

For frequencies that are near CF, the mechanical properties of the cochlear
partition and their relationship to auditory-nerve fiber responses may be more
complicated. At frequencies above 3 kHz, the tuning curves of Fig. 4 diverge from
the pressure curves; the tuning curves for the lower CF units begin to diverge at
lower frequencies than the curves for higher CF units. The pressure data of Fig. 4
do not extend to frequencies including CF. Techniques for extending the
measurements to higher frequencies or more apical locations must be developed before
combined measurements of pressure difference and motion can be used to determine
the dynamic properties of the cochlear partition over the entire range of interest.
Theories of cochlear mechanics based on these properties may help to explain how
the hydromechanical properties of the cochlea contribute to the frequency
selectivity represented in the tuning curves of auditory-nerve fibers.

ACKNOWLEDGMENTS: T. F. Weiss has made invaluable contributions to this work.
W. T. Peake and N. Y. S. Kiang have provided helpful suggestions and criticisms.
The author also wishes to thank D. W. Altmann, E. M. Marr, J. B. Miller, E. L.
Morgenstern, S. E. Moxon, S. A. Mrose, B. E. Norris, C. L. Pike, and G. S.
Roberts for their contributions to the conduct of experiments and preparation of the
manuscript. This work was supported by U. S. Public Health Service Grants
5 R01 NS01344, 5 T01 GM01555, 5 P01 GM14940, and 1 R01 NS11000.




REFERENCES:

Békésy, G. von (1960). Experiments in Hearing (McGraw-Hill, New York).

Burgeat, Burgeat-Menguy, C., and Lehmann, R. (1963). "Etude de la
Transmission de Pressions Acoustiques du Milieu Aerien aux Liquides Cochleares,"
Acustica 12, 377-382.

Guinan, J.J., and Peake, W.T. (1967). "Middle-Ear Characteristics of
Anesthetized Cats," J. Acoust. Soc. Am. 41, 1237-1261.

Johnstone, B.M., and Boyle, A.J.F. (1967). "Basilar Membrane Vibration
Examined with the Mössbauer Technique," Science 158, 389-390.

Kiang, N.Y.S., and Moxon, E.C. (1974). "Tails of Tuning Curves of
Auditory-Nerve Fibers," J. Acoust. Soc. Am., in press.

Kiang, N.Y.S., Moxon, E.C., and Levine, R.A. (1970). "Auditory-Nerve
Activity in Cats with Normal and Abnormal Cochleas," in Sensorineural
Hearing Loss, G. E. W. Wolstenholme and J. Knight, eds. (J. and A.
Churchill, Great Britain), pp. 241-273.

Kiang, N.Y.S., Watanabe, T., Thomas, E.C., and Clark, L.F. (1965). Discharge
Patterns of Single Fibers in the Cat's Auditory Nerve (M.I.T. Press,
Cambridge, Mass.).

Kohllöffel, L. U. E. (1972). "A Study of Basilar Membrane Vibrations," Acustica
27, 49-89.

Rhode, W. S. (1971). "Observations of the Basilar Membrane in Squirrel Monkeys
Using the Mössbauer Technique," J. Acoust. Soc. Am. 49, 1218-1231.

Schloss, F., and Strasberg, M. (1962). "Hydrophone Calibration in a Vibrating
Column of Liquid," J. Acoust. Soc. Am. 34, 958-960.

Schuknecht, H. F. (1960). "Neuroanatomical Correlates of Auditory Sensitivity and
Pitch Discrimination in the Cat," in Neural Mechanisms of the Auditory and
Vestibular Systems, G. L. Rasmussen and W. Windle, eds. (Charles C. Thomas,
Springfield, Ill.), pp. 105-115.

Wilson, J. P., and Johnstone, J. R. (1972). "Capacitive Probe Measures of Basilar
Membrane Vibration," Symposium on Hearing Theory, I.P.O. Eindhoven,
Holland.






COMMENTS ON: "Measurements of sound pressure in the cochleae of 
anesthetized cats" (V.NEDZELNITSKY) 

P.DALLOS 

Auditory Physiology Laboratory, Northwestern University, Evanston, Ill.

I would like to utilize Dr. Nedzelnitsky's excellent results to
provide yet one more indication of how well differentially recorded
cochlear microphonic potentials (CM) reflect the hydromechanical
phenomena that precede them. In the attached figure, Dr. Nedzelnitsky's
Fig. 3 is reproduced and some CM data are superimposed. The latter show
CM sensitivity (SPL at 3 µV CM, arbitrary vertical reference point)
and CM phase referred to the phase of the sound field at the eardrum.
The data points are for one animal that best matched the median curves
published previously (Dallos, 1970). The CM data were obtained with
differential electrodes in the basal turn of the cat; the bulla was
opened and the septum was removed. The agreement between the CM and the
pressure difference data is quite remarkable, indicating that the normal
CM appropriately reflects (at least at low sound levels) the displacement
of the cochlear partition, which is proportional to the pressure
differential across it.

These data also indicate that in cat the low frequency asymptote of
the pressure differential vs frequency function is 12 dB/octave and the
phase approaches 180°. I have suggested (Dallos, 1970) that this low
frequency behavior is directly attributable to the effect of the
helicotrema. In the cat, whose helicotrema is large, an effective pressure
equalization between the perilymphatic scalae takes place up to
relatively high frequencies (100-200 Hz). In other animals with small
helicotremas (e.g. guinea pig) such pressure equalization is not
significant in the audio range. Such species show low frequency behavior
characterized by 6 dB/octave slope of the CM magnitude function and 90°
phase lead.
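
The pairing of 6 dB/octave with 90° and of 12 dB/octave with 180° is what one would expect from first-order and second-order high-pass behavior respectively. The sketch below illustrates that generic relation only; it is not the helicotrema model of Dallos (1970), and the corner frequency and function names are hypothetical.

```python
import cmath
import math


def highpass_response(freq_hz, corner_hz, order):
    """Response of `order` cascaded first-order high-pass sections.

    Returns (magnitude in dB, phase in degrees); positive phase is a lead.
    """
    s = 1j * freq_hz / corner_hz
    h = (s / (1.0 + s)) ** order
    return 20.0 * math.log10(abs(h)), math.degrees(cmath.phase(h))


def slope_db_per_octave(freq_hz, corner_hz, order):
    """Magnitude change between freq_hz and one octave above it."""
    low_db, _ = highpass_response(freq_hz, corner_hz, order)
    high_db, _ = highpass_response(2.0 * freq_hz, corner_hz, order)
    return high_db - low_db


# Well below a hypothetical 1000 Hz corner the asymptotes are reached.
slope_1 = slope_db_per_octave(1.0, 1000.0, 1)
slope_2 = slope_db_per_octave(1.0, 1000.0, 2)
_, phase_1 = highpass_response(1.0, 1000.0, 1)
_, phase_2 = highpass_response(1.0, 1000.0, 2)
```

Far below the corner frequency each section contributes about 6 dB/octave and close to 90° of phase lead, so two sections give about 12 dB/octave and a phase approaching 180°.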



Reference:

Dallos, P. (1970). "Low Frequency Auditory Characteristics: Species
Dependence," J. Acoust. Soc. Amer. 48, 489-499.

ADDITIONAL REMARKS 

NEDZELNITSKY: I am pleased that Dr. Dallos has shown a comparison of some
of his CM data (Dallos, 1970) with data on the pressure difference across
the cochlear partition. His model of the low-frequency hydromechanical
behavior of the cochlea (Dallos, 1970) relates these data to the effect of
the helicotrema upon the cochlear acoustic input impedance. This impedance
was determined at low frequencies (Nedzelnitsky, 1974) from measurements
of sound pressure in the basal turn in scala vestibuli and the average
middle-ear transfer ratio (Guinan and Peake, 1967). A median impedance
curve was calculated using pressure data from 25 cats. At frequencies from
30 to 700 Hz, the angle of this median impedance is zero within ± 20 degrees.
Its magnitude shows frequency dependence, but from 30 to 700 Hz the upper
and lower bounds on this magnitude are only 13 dB apart. At these
frequencies, the measured cochlear input impedance is best described as
a frequency-dependent resistance. According to the model of Dr. Dallos,
the cochlear input impedance contains a significant reactive component at
low frequencies, which is manifested by a high-pass filter characteristic.

I agree with Dr. Dallos that the helicotrema has an important effect on 
the low-frequency hydromechanical behavior of the cochlea. However, further 
theoretical and experimental study is required to fit all of the pressure 
data obtained thus far and to relate them to other low-frequency auditory 
characteristics of cat. 

REFERENCES 

Dallos, P. (1970). "Low Frequency Auditory Characteristics: Species
Dependence," J. Acoust. Soc. Amer. 48, 489-499.

Guinan, J.J., Jr. and Peake, W.T. (1967). "Middle-Ear Characteristics of
Anesthetized Cats," J. Acoust. Soc. Amer. 41, 1237-1261.

Nedzelnitsky, V. (1974). "Measurements of Sound Pressure in the Cochleae 
of Anesthetized Cats," Sc.D. thesis, MIT, Department of Electrical 
Engineering. 







BASILAR MEMBRANE VIBRATION DATA AND THEIR RELATION TO THEORIES OF 
FREQUENCY ANALYSIS 

J.P. WILSON 

Department of Communication, University of Keele, Staffs, England. 



Four different techniques have now been used in making measurements 
of basilar membrane (BM) motion in different parts of the cochlea and in 
different species. This paper will compare and contrast the data obtained 
and will attempt to relate them with models of cochlear frequency 
analysis. 




Fig. 1 BM/STAPES RATIO






Wilson: BASILAR MEMBRANE DATA 

Fig. 1 compares the shapes of BM/stapes ratio frequency response
curves up to their frequencies of cut-off. The curves have been shifted
vertically to aid comparison: (a) is from Békésy (1960, fig. 12-23) for
guinea pig, (b) is from Wilson and J.R. Johnstone (1972, in prep., see
Wilson, 1973, for method) also from guinea pig, (c) is from Rhode (1971 &
1973, live data) in squirrel monkey, and (d) is from B.M. Johnstone &
Boyle (1967), Johnstone, Taylor, & Boyle (1970) and Johnstone & Taylor
(1970) from guinea pig. The curves from Wilson & Johnstone are
continuous plots and contain spurious peaks in the 4-8 kHz range due to
sharp notches in the incus responses, whereas smoothed curves are shown
for other sources.

Several features of these data can be observed: (1) all low-frequency
cut-offs tend to a value of 6dB/octave (lower dashed line segments) below
3-4 kHz; (2) above 3-4 kHz low frequency cut-offs approximate 12dB/octave
(upper dashes); (3) the responses tend to rise slowly by a few dB above
these baselines as the peak is approached. This region of slope increase
appears at a fixed frequency (3.5 kHz) in the data of Wilson & Johnstone
regardless of cut-off frequency in the range 15-43 kHz, and is followed by
a local increase in slope to 25dB/octave from 4 to 5 kHz. It is possible
that this may correspond with the high value of low frequency slope found
(at a slightly higher frequency) in squirrel monkey by Rhode. This view
is supported in Rhode's own data by the change of shape of response between
positions T1 & T2 in his fig. 8 (1971) and in the change in shape
post-mortem due to a progressive lowering of cut-off frequency to below
this critical region (1973).
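
As a reading aid for the slopes quoted above, the sketch below converts between a slope in dB/octave and the exponent n of an equivalent power law, amplitude proportional to f^n. This is only the standard decibel relation (20 log10 of 2 per unit exponent); the function names are hypothetical and nothing here is taken from the measurements themselves.

```python
import math

DB_PER_OCTAVE_PER_UNIT_EXPONENT = 20.0 * math.log10(2.0)  # about 6.02 dB


def db_per_octave(exponent: float) -> float:
    """Slope in dB/octave of an amplitude response proportional to f**exponent."""
    return DB_PER_OCTAVE_PER_UNIT_EXPONENT * exponent


def exponent_from_slope(slope_db_per_octave: float) -> float:
    """Power-law exponent equivalent to a given slope in dB/octave."""
    return slope_db_per_octave / DB_PER_OCTAVE_PER_UNIT_EXPONENT


for n in (1, 2):
    print(f"amplitude ~ f^{n}: {db_per_octave(n):.2f} dB/octave")
print(f"25 dB/octave ~ f^{exponent_from_slope(25.0):.2f}")
```

Read this way, the 6 and 12dB/octave baselines correspond to amplitude growing roughly as f and f squared, and the local 25dB/octave slope to an exponent of about 4.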

Fig. 2 compares basilar membrane and differential cochlear
microphonic (CM) responses at a constant SPL at the tympanic membrane in
guinea pig with open bulla: (a) shows individual continuous plots from
Wilson & Johnstone for cut-off frequencies in the 20-25 kHz range
superimposed at 1 kHz. The continuous line in (b) gives the average of
(a) above with typical rather than average high frequency response.
Curves A and B are for the most apical and basal placements measured
respectively. The coarse dashed line is the average from Kohllöffel (1972)
using laser fuzziness-detection. T1, T2, and T3 represent CM responses for
the three cochlear turns from Dallos (1973). The dotted curve from Békésy
(loc. cit.) has been corrected for open bulla. All curves have been
shifted vertically to coincide at low frequencies.

Fig. 2 BM & CM RESPONSES AT CONSTANT SPL

It is remarkable how closely the BM and CM functions coincide in all 
turns up to their respective cut-off frequencies, and would appear to 
support the simple Davis (1957) model of cochlear microphonic generation 
with voltage proportional to basilar membrane displacement without any 
significant derivative component. 

The high frequency slopes in the CM responses are much less than for 
the BM owing to the poorer spatial resolution both in the basal turn and 
for T3 compared with Békésy's curve. Kohllöffel (1971), however, derived
a spatial weighting function for monopolar CM electrodes in the basal turn 
and found that the distribution pattern along an array of 12 electrodes 
was consistent with basilar membrane displacement. 

Fig. 3 compares data from the BM with single cochlear nerve fibre (cn)
threshold curves (Evans, 1972) collected under, or corrected to, closed
bulla conditions. The BM curves are also plotted as "iso-response" curves
positioned according to the minimum SPLs used in determining them.

Fig. 3 MECHANICAL AND FIBRE THRESHOLDS

Critical points on the low frequency slope, peak, high frequency slope,
and plateau region of the capacitive probe responses of Wilson &
Johnstone were measured at these levels and each found to be linear up to
110dB SPL or higher. Intermediate segments of the curve were therefore
extrapolated downwards from continuous plots at higher linear levels.
For the Mössbauer measurements of Rhode, levels are correct at the peaks
but are greater at lower frequencies due to the limited velocity range of
the method. Although Békésy does not give specific details for the guinea
pig it would appear from other measurements that the levels must have been
in the 120-140dB SPL range. His curves have been derived from his fig.
12-23 using our own middle ear curves. The data of Johnstone et al.,
although not shown on this fig., would appear parallel to our own curve
with lowest 'threshold' at 60-70dB SPL.





It is apparent again that there is a considerable measure of 
agreement between the shapes of all the mechanical curves up to their 
respective cut-off frequencies. It should be noted that, of the two 
curves from Rhode, it is the one taken at lower SPLs with the sharper tip 
which runs more parallel with the guinea pig data of Wilson & Johnstone.
The higher low frequency slope just below the tip would appear to be
partly a function of the more pronounced notch (inverted in this fig.) in
the squirrel monkey curve (occurring at 5 kHz compared with 4 kHz for the
guinea pig).

The contrast between the mechanical data and the single nerve fibre 
data, however, is remarkable and has been commented on previously (Evans, 
1970, 1972; Evans & Wilson, 1973). It should be emphasised here that
this difference in shape is not due to different sound levels in the two
types of measurement: the minimum sound level used by Wilson & Johnstone
was actually below the minimum threshold for the sharp fibre curve at a
comparable frequency (although this does not necessarily represent the
best achievable fibre response at this frequency). Furthermore the
difference cannot be due to the electrophysiological state of the cochlea 
because N1 thresholds for clicks were as low during the mechanical 
measurements as in the single fibre experiments. Finally it cannot be due 
to the surgical procedures or draining of the cochlea because Evans (1970) 
has demonstrated sharp low threshold responses in this frequency region 
under identical conditions of exposure. 




Fig. 4 (abscissa: cut-off frequency, 50 Hz to 50 kHz)







Fig. 4 gives values of BM displacement at the response peak plotted
as a function of cut-off frequency (a) relative to stapes displacement
and (b) for a constant SPL at the tympanic membrane of, left, 100dB and,
right, 0 dB from the same sources as figs. 1-3 (corrected to the
appropriate conditions). Békésy's data, however, are for man (1960, figs.
11-50 and 6-43). It appears that Békésy does not provide a direct
measurement of amplitude at constant SPL, therefore (b) shows his derived
values from fig. 6-43 (continuous line), and corrected to pressure at the
eardrum and displacements at the basilar rather than Reissner's membrane
(dotted line). It will be noted that there is a very close agreement
between BM/stapes ratios for the three sets of basal turn data from
guinea pig and squirrel monkey, and that at these high cut-off frequencies
the ratios do not appear to depend greatly on frequency.




Fig. 5 BM FREQUENCY RESPONSES AND TRAVELLING WAVES

The dependence of the maximum BM/stapes ratio on cut-off frequency
is required in order to derive the travelling wave envelope from the
frequency response curve. This is illustrated diagrammatically in fig. 5
for a constant ratio of 30dB down to 1.25 kHz and a 4dB/octave (i.e.
4dB/2.5mm) fall off below that (see fig. 4). The figures at the top
indicate, left, the positions (mm from the basal end of the BM), and
right, the frequencies (kHz), of measurement. Figures have been rounded
off for simplicity in the figure and do not represent the exact length or
frequency limits of the guinea pig cochlea. Only one high frequency slope
has been included for clarity. The decline in BM/stapes ratio below 1.25
kHz represents the rate (4dB/oct) but not the absolute values found by
Békésy. A fall in excess of 6dB/octave would lead to a constant 'peak' in
the travelling wave at 12.5mm regardless of cut-off for frequencies below
1.25 kHz. The only data available on the travelling wave envelope at the
basal end of the cochlea are those of Kohllöffel (1972) who found basal
slopes of 3.1-5.6dB/mm for frequencies in the range 4-14 kHz which is
consistent with the value 4.8dB/mm (12dB/oct) used in fig. 5 for this
region. As this corresponds with the average frequency response slope
found by Johnstone et al. and Wilson & Johnstone for frequencies above 3.5
kHz, these data are consistent with a constant maximum BM/stapes ratio at
higher frequencies.
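
The unit conversions used in this paragraph can be checked with a short sketch. It assumes only the rounded spacing of 2.5 mm per octave stated for fig. 5; the constant and function names are hypothetical.

```python
MM_PER_OCTAVE = 2.5  # rounded place-frequency spacing stated for fig. 5


def db_per_mm(slope_db_per_octave: float,
              mm_per_octave: float = MM_PER_OCTAVE) -> float:
    """Convert a frequency-domain slope (dB/octave) to a spatial slope (dB/mm)."""
    return slope_db_per_octave / mm_per_octave


def db_per_octave_from_mm(slope_db_per_mm: float,
                          mm_per_octave: float = MM_PER_OCTAVE) -> float:
    """Convert a spatial slope (dB/mm) back to dB/octave."""
    return slope_db_per_mm * mm_per_octave


print(f"12 dB/oct  = {db_per_mm(12.0):.1f} dB/mm")
print(f"4 dB/oct   = {db_per_mm(4.0):.1f} dB/mm")
for spatial in (3.1, 5.6):  # range of basal slopes reported by Kohllöffel
    print(f"{spatial} dB/mm = {db_per_octave_from_mm(spatial):.2f} dB/oct")
```

With this spacing, 12dB/oct corresponds to 4.8dB/mm as stated, and the reported 3.1-5.6dB/mm range corresponds to about 7.75-14dB/oct, which brackets the 12dB/oct value.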

Minimum wavelengths in the instantaneous travelling wave calculated 
from the phase data of Békésy (1960), Rhode (1971), and Wilson & Johnstone
(1972) are all within the range 0.7-2mm, indicating that the predominant
shearing component should be radial and that the Mössbauer sources and
capacitive probe tips currently in use are sufficiently small to measure 
basilar membrane motion accurately. 

In view of the data discussed above it is possible to reject Rhode’s 
(1971) suggestion that at threshold neural and mechanical responses might 
coincide. If, however, the sharp tip in his response curve is not merely 
due to an unfortunate choice of frequency region and can be confirmed at 
other positions, another possibility could be suggested. Perhaps the 
actual hair cell excitation function has a much sharper corner to its low
pass characteristic than measured on the lower surface of the basilar
membrane, which is then turned into a narrow band-pass characteristic by
a differencing or interactive mechanism operating at signal frequencies
(and therefore presumably not via the outer spiral fibres). This would
not be inconsistent with the CM findings because those already demonstrate
a rounding of the corner of the BM response curve due to spatial averaging
(see fig. 2). It may in this way be possible to explain the low sound
levels at which Rhode finds non-linearity if, in this region of the
cochlea, BM motion is coupled strongly to a non-linear tectorial membrane
or via non-linear hair cell action. 

Such a model, however, could probably not account for the properties
of the cubic difference tone (2f1-f2) which is not present in basilar
membrane motion (Wilson & Johnstone, 1973) or in the cochlear microphonic
(Dallos, 1969) and ignores the possibly different mode of action of the
inner hair cells, from which most afferent fibres arise (Spoendlin, 1972). 
At this stage it is probably not possible to decide whether the inner 
hair cell is stimulated by a much narrower range of frequencies due to 
hydromechanical action or whether the inner hair cell has conventional 
transducer action and some further frequency selective properties of its 
own due to electrochemical or membrane properties. 



References 

Békésy, G. von (1960) Experiments in Hearing, McGraw-Hill, New York.

Dallos, P. (1969) Combination tone 2f1-f2 in microphonic potentials.
J. Acoust. Soc. Amer. 46, 1437-1444.

Dallos, P. (1973) Cochlear potentials and cochlear mechanics. In Basic
Mechanisms in Hearing, Ed. A.R. Møller, Academic Press, New York,
335-376.

Davis, H. (1957) Biophysics and physiology of the inner ear. Physiol. Rev.
37, 1-49.

Evans, E.F. (1970) Narrow "tuning" of the responses of cochlear nerve
fibres emanating from the exposed basilar membrane. J. Physiol. 208,
75-76P.

Evans, E.F. (1972) The frequency response and other properties of single
nerve fibres in the guinea pig cochlea. J. Physiol. 226, 263-287.

Evans, E.F. & Wilson, J.P. (1973) The frequency selectivity of the cochlea.
In Basic Mechanisms in Hearing, Ed. A.R. Møller, Academic Press, New
York, 519-554.

Johnstone, B.M. & Boyle, A.J.F. (1967) Basilar membrane vibration
examined with the Mössbauer technique. Science, 158, 389-390.

Johnstone, B.M., Taylor, K.J. & Boyle, A.J. (1970) Mechanics of the guinea
pig cochlea. J. Acoust. Soc. Amer., 47, 504-509.

Johnstone, B.M. & Taylor, K.J. (1970) Mechanical aspects of cochlear
function. In Frequency Analysis & Periodicity Detection in Hearing,
Ed. R. Plomp & G.F. Smoorenburg, Sijthoff, Leiden, 81-93.

Kohllöffel, L.U.E. (1971) Studies of the distribution of cochlear
potentials along the basilar membrane. Acta Otolaryngol. Suppl. 288.

Kohllöffel, L.U.E. (1972) A study of basilar membrane vibrations, I, II,
Acustica 27, 49-89.

Rhode, W.S. (1971) Observations of the vibration of the basilar membrane
in squirrel monkeys using the Mössbauer technique. J. Acoust. Soc.
Amer. 49, 1218-1231.

Rhode, W.S. (1973) An investigation of post-mortem cochlear mechanics
using the Mössbauer effect. In Basic Mechanisms in Hearing, Ed. A.R.
Møller, Academic Press, New York, 49-68.

Wilson, J.P. (1973) A sub-miniature capacitive probe for vibration
measurements of the basilar membrane. J. Sound Vib. 30, 483-493.

Wilson, J.P. & Johnstone, J.R. (1972) Capacitive probe measures of
basilar membrane vibration. In Hearing Theory 1972, IPO Eindhoven,
172-181.

Wilson, J.P. & Johnstone, J.R. (1973) Basilar membrane correlates of the
combination tone 2f1-f2. Nature, 241, 206-207.







COMMENTS ON: "Basilar membrane vibration data and their relation to
theories of frequency analysis" (J.P. WILSON)

L. ROBLES 

Department of Neurophysiology, University of Wisconsin, Madison, Wisconsin 



It is worth mentioning that in a paper recently submitted for
publication, Geisler et al. compare the basilar membrane response
obtained by Rhode with single auditory nerve fiber data from squirrel
monkeys obtained by them. From their comparison of iso-amplitude
contours of basilar membrane displacement with iso-rate contours of
auditory nerve fibers having best frequencies in the same frequency
range, they concluded that, even though there are striking similarities
between both sets of data, the neural data cannot be explained directly
by the displacement response of the basilar membrane. They suggest, as
does Dr. Wilson in this paper, that some transformation of the basilar
membrane displacement must be involved in the generation of the neural
responses.

Geisler, C.D., Rhode, W.S., and Kennedy, D.T. "The Responses to Tonal
Stimuli of Single Auditory Nerve Fibers and their Relationship to
Basilar Membrane Motion in the Squirrel Monkey," J. Neurophysiol.
(submitted).



ADDITIONAL REMARKS 

TONNDORF: With reference to your calculations of traveling-wave
wavelength, I did the same thing using smoothed calculated data from
Greenwood (unpublished). I calculated λ for the 2 mm point of
Johnstone et al. (f0 = 18 kHz) for frequencies of 30 k, 24 k, and 20 k.
Wavelengths were slightly smaller than yours, i.e., 0.25 - 1 mm. The
reason for doing so was the possibility that the Mössbauer target,
under the effect of very short λ, might rock, thus causing the
extremely steep h.-f. slopes of the tuning curves observed. Since I
mentioned this possibility in my abstract, I would like to state
here that I fully agree with you, i.e., that this possibility is very
remote. The target size (50 µm) is sufficiently small for such
measurements.
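
One rough way to put numbers on this comparison is to express the target size as a fraction of the travelling-wave wavelength, i.e. as the phase difference of the wave across the target. The sketch below is illustrative only: it takes the quoted target size as 50 µm, uses the wavelength ranges mentioned in the discussion (0.25-1 mm and 0.7-2 mm), and the function name is hypothetical.

```python
TARGET_SIZE_MM = 0.05  # target size quoted in the remark, taken as 50 micrometres


def phase_span_deg(target_size_mm: float, wavelength_mm: float) -> float:
    """Phase difference (degrees) of a travelling wave across the target."""
    return 360.0 * target_size_mm / wavelength_mm


for wavelength_mm in (0.25, 0.7, 1.0, 2.0):
    span = phase_span_deg(TARGET_SIZE_MM, wavelength_mm)
    print(f"wavelength {wavelength_mm:4.2f} mm -> {span:5.1f} degrees across target")
```

Even at the shortest wavelength considered (0.25 mm) the target spans one fifth of a wavelength, and at the 0.7-2 mm wavelengths it spans less than a tenth.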







THE SIGNIFICANCE OF SHEARING DISPLACEMENTS FOR THE MECHANICAL
STIMULATION OF COCHLEAR HAIR CELLS 

JUERGEN TONNDORF 

Columbia University, New York



Hair cells respond only to deflections of their sensory cilia
caused by tangential, "shearing", displacements of the membrane covering
a given sense organ (Steinhausen, 1931; von Holst, 1950).

Cochlear hair cells are displacement receivers (Békésy, 1951). The
displacements of the partition lead to shearing displacements between
the tectorial membrane and the organ of Corti (Békésy, 1953a). For a
given frequency, two modes of shear occur, each in its separate
location (Fig. 1): (a) radially-directed in a region proximal to the
("vertical") traveling-wave maximum and (b) longitudinally-directed in
a region beyond that point. Other experiments (Békésy, 1953a) showed
that hair cells are directionally-sensitive; orientation of the
resulting figure-8 response patterns on outer and inner peripheries of
the tectorial membrane was consistent with the assumption that outer
hair cells respond to radial shear and inner hair cells to longitudinal
shear (Fig. 2).

Fig. 1: From Békésy (1953a).




Fig. 2: From Békésy (1953b).







Tonndorf: SHEARING DISPLACEMENTS

The directional sensitivity appears to be a general feature of all
hair cells (Lowenstein & Wersäll, 1959). In extra-cochlear hair cells,
maximal responses result from ciliary displacements along the
stereocilia/kinocilium axis. Since the basal body of cochlear hair cells
represents the embryological remnant of a kinocilium (Friedman, 1963),
the interpretation of Békésy's experimental results (cf. Fig. 2) got
into difficulties. For it is well known that the stereocilia/basal-body
axes of all cochlear hair cells point in the radial direction.

Outer hair cells respond to radial-shear displacements between the
organ of Corti and the tectorial membrane to which their cilia are
attached (Kimura, 1966; Spoendlin, 1966; Lim, 1972). Ciliary deflections
are in phase with basilar membrane displacements. The various models
proposed (Davis, 1957; Tonndorf, 1960; Rhode & Geisler, 1966; Billone
& Raynor, 1973) differ from one another only in some details.

Fig. 3: Extreme downward position; overlay: extreme upward
position. From Tonndorf (1970).

Figure 3 gives this author's version of radial shear. Note the
directions of shearing stresses in the organ and in the tectorial
membrane which favor stimulation of outer hair cells. The models of Davis
(1957) and of Rhode & Geisler (1966) did not include the phase
opposition on the two sides of the organ.

The problem of inner-hair cell stimulation received new impetus by 
experimental findings indicating that these latter cells are velocity 
receivers (Dallos et al., 1972; Zwislocki & Sokolich, 1973).

Based upon the histological finding that only the cilia of outer
hair cells are connected to the tectorial membrane, but those of the
inner hair cells are not (Lim, 1972), Billone & Raynor (1973) offered
the following model: The cilia of outer hair cells respond directly to
shearing displacement; those of inner hair cells, however, floating
freely in the endolymph, respond to the velocity of a fluid motion that
is radially-directed, as would be required by their structural
orientation. This model has still some flaws, mainly that it does not
account for the increase in Q of tuning curves in the conversion from
basilar-membrane displacement to shear (see Fig. 9, later).

However, there is some evidence that the cilia of inner hair cells
might also be connected to the tectorial membrane: (1) On surface
preparations of the tectorial membrane, tufts of hairs from both types of
hair cells are occasionally found clinging to the underside of that
membrane (Johnsson & Hawkins, 1973, pers. demonstration to this author;
Johnsson, 1974, pers. comm.). (2) In an effort to resolve the phase
controversy alluded to in Fig. 3, D. J. Lim (1972, unpublished)
conducted the following experiment: He displaced the cochlear partition
by hydrostatic pressure and fixated it in that position. Figure 4
shows cilia on the inner and outer hair cells. All of them are
deflected, and those of inner and outer hair cells in opposite
directions. This result could only come about if the cilia of both types
of hair cells are connected to the tectorial membrane, for free-floating
cilia could not respond to a dc displacement in this manner.
D. J. Lim has not yet published these results himself, since he
feels that they may have been influenced by histological artifacts.

Fig. 4: Guinea pig organ of Corti (basal turn); magn. 333x; osmic acid
fixation; embedded in epon. (From Lim, 1972, unpublished.)

Fig. 5: Adapted from Békésy (1953a).

In the spatial domain, longitudinal shearing stresses are maximal
at points of zero displacement of the cochlear partition (Fig. 5). The
same phase relation ought to exist in the temporal domain. Indeed, a
re-check of movie films on intracochlear shear taken earlier in the
author's model (Tonndorf, 1960), and renewed observations in this model,
revealed a 90° phase relationship between the two shear modes in the
region of their overlap (cf. Fig. 7, below), indicating that
longitudinal shear is in phase with the velocity of basilar-membrane motion.
The same phase relationship had been implicit in a computer simulation
of longitudinal shear (Khanna et al., 1968).
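
The 90° relationship invoked here is the ordinary one between the displacement and the velocity of a sinusoidal motion. The sketch below illustrates only that general fact with phasors; it is not a model of the cochlea or of the shear modes, and the function names and the 1 kHz example frequency are hypothetical.

```python
import cmath
import math


def velocity_phasor(displacement_phasor: complex, freq_hz: float) -> complex:
    """Velocity phasor of a sinusoidal motion: time derivative = j*omega*X."""
    omega = 2.0 * math.pi * freq_hz
    return 1j * omega * displacement_phasor


def phase_lead_deg(a: complex, b: complex) -> float:
    """Phase by which phasor a leads phasor b, in degrees."""
    return math.degrees(cmath.phase(a / b))


displacement = 1.0 + 0.0j  # unit-amplitude displacement, zero phase
velocity = velocity_phasor(displacement, freq_hz=1000.0)
print(f"velocity leads displacement by {phase_lead_deg(velocity, displacement):.1f} degrees")
```

A quantity that is in quadrature with displacement at every frequency, as the two shear modes are reported to be in their region of overlap, is therefore in phase (or in antiphase) with velocity.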

Admittedly, the structural orientation of inner hair cells, in
conjunction with the findings of Lowenstein & Wersäll (1959), makes it
still difficult to see how inner hair cells are able to respond to
longitudinal shear. Nevertheless, the following four facts cannot be
discarded as mere coincidences: (1) the connection of inner hair cell
cilia to the tectorial membrane (Fig. 4); (2) the response of inner
hair cells to longitudinal shear (Fig. 2); (3) the relation of
longitudinal shear to the velocity of basilar-membrane motion (Fig. 5); and
(4) the fact that inner hair cells are velocity receivers.

Fig. 6: Radial shear in a cochlear model (schematic cross-section).

Mechanical cochlear models (Fig. 6) duplicated the events observed
by Békésy in guinea-pig cochleae very well with respect to both shear
modes (Tonndorf, 1960). Measurements of the envelopes over (a) the
traveling wave and (b) the two resulting shear modes indicated the
following (Fig. 7): (1) Maximal amplitudes of either shear mode were
15-20 dB smaller than that of the original traveling wave. [This
phenomenon had been predicted by Békésy (1953a). He had referred to it as
the "second transformer action of the ear", invoking a mechanism
essentially similar to that of the "curved-membrane principle" of
Helmholtz (1868).] (2) The two shear modes occurred in the same spatial
sequence as in Békésy's guinea-pig cochleae, except that the point of
maximal displacement of the longitudinal shear coincided with the
maximum of the traveling wave, instead of occurring beyond that point.
[However, recent model observations (Tonndorf, 1974, unpublished)
showed that at higher frequencies the location of the traveling-wave
maximum and that of the longitudinal shear became separated in the
sense of Fig. 1.] (3) In comparison to the envelope over the traveling
wave, those over both modes of shear were restricted in length at the
expense of their proximal slopes. [This fact also agrees, at
least qualitatively, with Békésy's observations (cf. Fig. 1).] (4)
There was a certain overlap between both shear modes at levels somewhat
below their maxima. [At levels at which Békésy had to carry out his
observations, such overlap was obviously beyond the limits of optical
resolution.]
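
Item (1) above gives the shear maxima as 15-20 dB below the traveling-wave maximum. For readers who prefer linear amplitude ratios, the sketch below applies the standard amplitude-decibel relation (ratio = 10^(dB/20)); the function name is hypothetical and no model data are involved.

```python
def amplitude_ratio(level_db: float) -> float:
    """Linear amplitude ratio corresponding to a level difference in dB."""
    return 10.0 ** (level_db / 20.0)


for reduction_db in (15.0, 20.0):
    ratio = amplitude_ratio(-reduction_db)
    print(f"{reduction_db:.0f} dB smaller -> amplitude ratio {ratio:.3f}")
```

A reduction of 15-20 dB thus means shear amplitudes of roughly one sixth to one tenth of the traveling-wave amplitude.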

Generally speaking, shearing stresses are proportional to the
curvature of a structure undergoing a curved displacement. Slow-motion,
stroboscopic observations in the model, of which Fig. 8 shows one
instant, indicated that a given crest (or trough) was originally long
and pear-shaped, with its blunt end pointing forward. A cross-section
in this proximal region resembled a bell-shaped curve, and had thus a
narrow peak. On account of the general decrease of the travel speed
with distance, the front of the crest traveled slower than its tail,
so that it gradually became shorter and wider. (The traveling wave
represents obviously a volume displacement.) The extreme of the latter
shape was reached in the region of the low-amplitude, short wavelets
beyond the point of maximum. Thus, the direction of shear is related
to the dominant curvature existing in each region.

Fig. 8: From Tonndorf (1960).

However, this cannot be the whole explanation, since the
longitudinal-shear maximum does not occur at the extreme end of the
traveling-wave envelope, i.e., in the region of the shortest wavelets
where the longitudinally-directed curvature is largest. Since
longitudinal shear is related to the velocity of basilar-membrane motion,
and thus to the rate of change of the longitudinally-directed curvature,
the maximum of the longitudinal shear mode must occur somewhere more
proximally, i.e., at a point where the amplitude (and thus the
velocity for a given frequency) is still larger. Moreover, the
dependence of radial shear upon the radially-directed curvature and that
of longitudinal shear upon the rate of change of the longitudinally-
directed curvature must aid in the spatial separation of their respective
envelopes.

A computer simulation of longitudinal shear (Khanna et al., 1968),
based partly upon Békésy's data (1947) and partly upon those of Tonndorf
(1960), showed that in the conversion of basilar-membrane displacement
to shear the Q of the tuning curves was increased at the expense of their
low-frequency slopes (Fig. 9). It stands to reason that the same changes
must take place in the radial-shear domain.









Evans & Wilson (1973) found the low-frequency slopes of mechanical,
basilar-membrane, tuning curves (even in the high-frequency data of
Johnstone et al. (1970) and of Rhode (1971)) much lower than those of
comparable, neural curves of the cochlear nerve. (The high-frequency
slopes did not present the same difficulties.) Evans & Wilson
postulated therefore an intervening mechanism that should affect tuning
curves like a high-pass filter, but that should also be sensitive to
damages of various kinds. The present author suggested that this
mechanism could be represented by the conversion to shear in the sense of
present Fig. 9. The vulnerability of cilia to acoustic overstimulation
(Lim, 1971; Bredberg et al., 1972) and that of the tectorial membrane/
ciliary junctions to chemical alterations of endolymph (Tonndorf et al.,
1962) could well represent the kind of interference predicted by Evans
& Wilson.

Evans (1973, pers. comm.) does not accept this suggestion, his
objection being that the magnitudes of the changes in slope in present
Fig. 9 are not large enough.

I would like to comment on this objection in the following way:
(1) both Békésy's data (1947) and those of Tonndorf (1960) came from
low-frequency observations; (2) the Q of all curves was taken as
constant. The fact that high-frequency curves have higher Q's (Johnstone
& Boyle, 1967; Tonndorf & Khanna, 1968) was not readily understood when
the computations were made; (3) the difficulty in making quantitative
comparisons between different species is compounded when it comes to
comparing animal data with those obtained in models.

The demonstrated close agreement between the cochlear observations
(Békésy) and the model findings (Tonndorf) gives confidence in
predictions derived from these models. It is therefore suggested that
the shearing conversion fulfills, at least quantitatively, the
postulate of Evans & Wilson. At present, no need is seen to invoke other,
additional mechanism(s) that will further affect the Q of tuning curves
in the conversion of mechanical cochlear events into neural responses,
which is thus in general agreement with Møller (1972).

Note: Supported by NIH grants

References 

Békésy, G. von (1947). "Variation of Phase along the Basilar
Membrane with Sinusoidal Vibrations," J. Acoust. Soc. Amer. 19,
452-460.

Békésy, G. von (1951). "Microphonics Produced by Touching the Cochlear
Partition with a Vibrating Electrode," J. Acoust. Soc. Amer. 23,
29-35.

Békésy, G. von (1953a). "Description of Some Mechanical Properties of
the Organ of Corti," J. Acoust. Soc. Amer. 25, 770-785.

Békésy, G. von (1953b). "Shearing Microphonics Produced by Vibrations
Near the Inner and Outer Hair Cells," J. Acoust. Soc. Amer. 25,
786-790.

Billone, M. & Raynor, S. (1973). "Transmission of Radial Shear Forces
to Cochlear Hair Cells," J. Acoust. Soc. Amer. 54, 1143-1156.

Bredberg, G., Ades, H. W., & Engström, H. (1972). "Scanning Electron
Microscopy in the Normal and Pathologically Altered Organ of Corti,"
Acta Otolaryng. suppl. 301, 3-48.

Dallos, P., Billone, M. C., Durrant, J. D., Wang, C.-Y., & Raynor, S.
(1972). "Cochlear Inner and Outer Hair Cells: Functional
Differences," Science 177, 356-358.

Davis, H. (1957). In Physiological Triggers and Discontinuous Rate
Processes. T. H. Bullock (ed), Washington: Amer. Physiological Soc.

Evans, E. F. (1973). Personal Communication.

Evans, E. F. & Wilson, J. P. (1973). "The Frequency Selectivity of the
Cochlea," In Basic Mechanisms in Hearing. A. R. Møller (ed).
Academic Press, New York.

Friedman, J. A. (1963). "Cytologie de l'oreille au Microscope
Electrique," Triangle (J. Sandoz) 6, 74.




Helmholtz, H. (1868). "Die Mechanik der Gehörknöchelchen und des
Trommelfells," Pflügers Archiv 1, 1-60.

Holst, E. von (1950). "Die Tätigkeit des Statolithenapparates im
Wirbeltierlabyrinth," Naturwiss. 37, 265-272.

Johnsson, L. G. (1974). Personal Communication.

Johnsson, L. G. & Hawkins, J. W. (1973). Personal Communication.

Johnstone, B. M. & Boyle, A. J. F. (1967). "Basilar Membrane Vibration
Examined with the Mössbauer Technique," Science 158, 389-390.

Johnstone, B. M., Taylor, K. J., & Boyle, A. J. F. (1970). "Mechanics
of the Guinea Pig Cochlea," J. Acoust. Soc. Amer. 47, 504-509.

Khanna, S. M., Sears, R. E., & Tonndorf, J. (1968). "Some Properties of
Longitudinal Shear Waves: A Study by Computer Simulation," J. Acoust.
Soc. Amer. 43, 1077-1084.

Kimura, R. S. (1966). "Hairs of the Cochlear Sensory Cells and Their
Attachment to the Tectorial Membrane," Acta Otolaryng. 61, 55-72.

Lim, D. J. (1971). "Acoustic Damage of the Cochlea," Arch. Otolaryng.
94, 294-305.

Lim, D. J. (1972). "Fine Morphology of the Tectorial Membrane, Its
Relationship to the Organ of Corti," Arch. Otolaryng. 96, 199-215.

Lim, D. J. (1972). Unpublished Data.

Lowenstein, O., & Wersäll, J. (1959). "A Functional Interpretation of
the Electron Microscopic Structure of the Sensory Hairs in the
Crista of the elasmobranch Raia clavata in terms of Directional
Sensitivity," Nature (London) 184, 1807-1810.

Møller, A. R. (1972). "Coding of Sounds in Lower Levels of the Auditory
System," Quart. Rev. Biophys. 5, 59-155.

Rhode, W. S. (1971). "Observations of the Vibration of the Basilar
Membrane in Squirrel Monkeys Using the Mössbauer Technique,"
J. Acoust. Soc. Amer. 49, 1218-1231.

Rhode, W. S. & Geisler, C. D. (1966). "Model of the Displacement Between
Opposing Points on the Tectorial Membrane and Reticular Lamina," J.
Acoust. Soc. Amer. 42, 185-190.

Spoendlin, H. (1966). The Organization of the Cochlear Receptor, Vol. 13
of Advances in Oto-Rhino-Laryng., L. Ruedi (ed), Karger, Basel.







Steinhausen, W. (1931). "Über den Nachweis der Bewegung der Cupula in
der intakten Bogengangsampulle des Labyrinths bei der natürlichen
rotatorischen und kalorischen Reizung," Pflügers Arch. ges. Physiol.
228, 322.

Tonndorf, J. (1960). "Shearing Motion in Scala Media of Cochlear
Models," J. Acoust. Soc. Amer. 32, 238-244.

Tonndorf, J. (1970). "Cochlear Mechanics and Hydrodynamics," In
Foundations of Modern Auditory Theory. Vol. I, J. Tobias (ed).
Academic Press, New York.

Tonndorf, J. (1974). Unpublished Data.

Tonndorf, J., Duvall, A. J. III, and Reneau, J. P. (1962). "Permeability
of Intracochlear Membranes to Various Vital Stains," Ann. Oto-Rhino-
Laryng. 71, 801-839.

Tonndorf, J., & Khanna, S. M. (1968). "Displacement Pattern of the
Basilar Membrane: A Comparison of Experimental Data," Science 160,
1139-1140.

Zwislocki, J. J. & Sokolich, W. G. (1973). "Velocity and Displacement
Response in Auditory-Nerve Fibers," Science 182, 64-66.







COMMENTS ON: "The significance of shearing displacements for the 
mechanical stimulation of cochlear hair cells" (J. TONNDORF) 

P. DALLOS 

Auditory Physiology Laboratory, Northwestern University, Evanston, Ill.

The purpose of this Comment is to clarify a common misconception
pertaining to von Bekesy's data (1953) obtained with the vibrating
electrode. Some of these data serve as an important foundation to
Dr. Tonndorf's scheme of cochlear function. We have demonstrated that the
microphonic (CM) output of inner hair cells is significantly (30-40 dB)
below that of the outer hair cells (e.g. Dallos, 1973). This means that
the CM measured from a normal ear is completely dominated by the outputs
of outer hair cells. As a consequence, all of von Bekesy's studies
utilizing the vibrating electrode reflect the output properties of
outer hair cells. Specifically, those results imply that outer hair
cells are excited in proportion to basilar membrane displacement, and
that the apparent directional sensitivity shown in Dr. Tonndorf's
Fig. 2 (von Bekesy, 1960, p. 707) reflects the changing excitation of
outer hair cells depending on the locus of stimulus application. Thus
there is no experimental evidence that would suggest that the inner
hair cells are primarily sensitive to longitudinal shear.
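
The dominance argument can be made concrete with a small calculation. The sketch below is an illustration only, not part of the Comment: it assumes two lumped sources that add in phase (the worst case for the weaker one) and asks how much the summed level differs from the stronger source alone. The function names are hypothetical.

```python
import math

def db_to_amplitude_ratio(level_db: float) -> float:
    """Convert a level difference in dB to an amplitude ratio."""
    return 10.0 ** (level_db / 20.0)

def summed_level_change_db(deficit_db: float) -> float:
    """Level change (dB) when a second, in-phase source lying
    `deficit_db` below the first one is added to it."""
    weaker = db_to_amplitude_ratio(-deficit_db)
    return 20.0 * math.log10(1.0 + weaker)

# Assumed illustration: a contribution 30 or 40 dB below the dominant one.
for deficit in (30.0, 40.0):
    print(deficit, round(summed_level_change_db(deficit), 3))
```

Under these assumptions a source 30 dB down changes the summed level by well under 1 dB, which is the sense in which the stronger output dominates the recording.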

It might also be worth mentioning that while there is some conflicting
evidence, the preponderance of convincing experimental observations
indicates that inner hair cell cilia do not normally contact the
tectorial membrane. Figure 4 reflects an isolated observation and Dr. Lim
is rather dubious about its general validity (private communication,
1974).

References 

Bekesy, G. von (1953) "Shearing Microphonics Produced by Vibrations
Near the Inner and Outer Hair Cells," J. Acoust. Soc. Amer. 25, 786-790.

Bekesy, G. von (1960) Experiments in Hearing, (McGraw-Hill, New York.)

Dallos, P. (1973) "Cochlear Potentials and Cochlear Mechanics," in
Basic Mechanisms in Hearing (A. Møller, Ed., Academic, New York) 355-372.







ADDITIONAL REMARKS 



TONNDORF: In the experiment in question, Bekesy (1953) used local
mechanical excitation whereas Dallos employed acoustic stimulation.
Therefore, neither can Dallos' 30-40 dB difference between the outputs of
the two types of hair cells be directly applied to Bekesy's experimental
situation, nor can the demonstrated longitudinal sensitivity of i.h.c. be
said to "reflect the changing excitation of o.h.c. depending upon the locus
of stimulus application." Moreover, the results of Kiang, Moxon, and
Levine (1970) suggest that the magnitude of the difference in sensitivity
between the two types of cells is, at most, only 20 dB.

WILSON: I disagree that it is possible to explain the differences between
basilar membrane displacement and neural tunings by means of the shearing
displacement mechanisms proposed by Dr. Tonndorf. Firstly, the longitudinal
shearing component cannot be sufficiently sharp to account for an increase
in low frequency slope from zero for the basilar membrane (low pass
characteristic at constant SPL) to 100-200 dB/octave (round figures from
the data of Kiang et al., 1965; Evans, 1972). A differentiated response
would give only an increase of 6 dB/octave, and the longitudinal shear
component cannot be great when the shortest wave length in the traveling
wave is 0.7-2 mm compared with radial shearing with a half-wave length
across the basilar membrane of 0.15 mm (Wilson, this volume). Secondly,
the radial shearing component for the outer hair cells cannot be
significantly sharper than the basilar membrane displacement pattern in
view of the excellent agreement with differential cochlear microphonic
responses (see Wilson, this volume, Fig. 2). The high frequency slopes
observed in the microphonics, although much less than for the basilar
membrane, indicate that the method would measure much lower frequency
slopes accurately. No such differences in low frequency slope occur.

TONNDORF: (1) The slope of the l.f. portion of tuning curves is a matter
of definition. If one chooses as reference constant SPL at the footplate,
as Wilson does, the slope indeed approaches zero. However, for constant
stapes displacement, it is approximately +12 dB/oct., and I think this
can be made considerably sharper by shearing transformation. (2) That
there is longitudinal shear is an experimental fact. Also, one should
not forget the exchange between displacement (or velocity) and force in
this respect. (3) The interpretation of CM results is notoriously difficult
because of the summed responses from hundreds of hair cells, even for
diff. electrodes.
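
The slope figures exchanged above (6 dB/octave per differentiation, about +12 dB/oct., 100-200 dB/octave) can be related by a short calculation. The sketch below is an idealisation added for illustration: it assumes an amplitude response proportional to a power of frequency, and the function name is hypothetical.

```python
import math

def slope_db_per_octave(exponent: float) -> float:
    """Slope of an amplitude response proportional to f**exponent."""
    return 20.0 * exponent * math.log10(2.0)

# One differentiation multiplies the amplitude by a factor proportional to f.
one_differentiation = slope_db_per_octave(1)
# A slope of about 12 dB/octave corresponds to amplitude proportional to f**2.
second_power = slope_db_per_octave(2)
# Number of successive differentiations needed to reach 100 dB/octave.
orders_for_100_db = math.ceil(100.0 / one_differentiation)

print(round(one_differentiation, 2), round(second_power, 2), orders_for_100_db)
```

In this idealisation a single differentiation adds roughly 6 dB/octave, so slopes of the order of 100 dB/octave would require many such stages.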







ENLARGED HYDROMECHANICAL COCHLEA MODEL WITH BASILAR MEMBRANE 
AND TECTORIAL MEMBRANE 

R. HELLE 

Institut für Elektroakustik, Technische Universität München, FRG

1. Introduction: The formation of the traveling wave on the basilar
membrane in the cochlea, produced by the motion of the stapes, is
considered to be sufficiently known according to observations made at the
"outer side" of scala media by v. BÉKÉSY (1960), JOHNSTONE and BOYLE
(1967), WILSON and JOHNSTONE (1972), KOHLLÖFFEL (1972). The final
processes within the scala media creating the adequate stimulus for the
hair cells are still obscure (FLOCK (1971)), because it is extremely
difficult to investigate vibrations within the scala media. Therefore an
enlarged hydromechanical model simulating the full length of the cochlea
has been built (cf. HELLE (1974)) including the tectorial membrane, which
is thought to play a crucial role in the final stimulation of the hair
cells in the cochlea.

2. Description of the model: The straight model is enlarged by a factor of
30 in comparison with the length of the human cochlea; the frequency range
is reduced by a factor of 30. Between the two scalae of uniform cross
section (Plexiglas), filled with a glycerol solution (viscosity about 30
times that of water), a stiff partition (OL) bears the imitation of the
sensory organ (BM, CO, TM), cf. Fig. 1.

The basilar membrane (BM) and the organ of Corti (CO) are cast from
the same soft silicone rubber and glued without tension over the tapering
slot in the stiff partition (OL). The volume elasticity achieved, at least
in the middle part of the model, is in sufficient agreement with
v. BÉKÉSY's (1960) data for the natural cochlea. The tectorial membrane is
cast from softer, transparent but impermeable silicone rubber.

The gap between the tectorial membrane and the organ of Corti permits
both relative movements and flow of fluid from the inner sulcus (IS)
through the gap to the volume outside. The omission of Reissner's membrane
is assumed to be justified, cf. v. BÉKÉSY (1960), ZWICKER (1974 b).

This way, the model contains the most important parts of the
hydromechanic system not only of the cochlea but of the scala media too,
and can help, especially by observing the gap between the organ of Corti
and the tectorial membrane, to learn more about the mechanisms proposed as
adequate stimulus, cf. NEUBERT (1949), v. BÉKÉSY (1960), ZWICKER (1972,
1974 a).

Fig. 1: Top - left: Cross section of the cochlea (apex) of the squirrel
monkey, schematic, cf. ZWICKER (1972). Top - right: Cross section AA' of
the model near the "helicotrema". Bottom: Longitudinal section CC' of the
model. Abbreviations: SV: scala vestibuli, RM: Reissner's membrane,
SL: spiral limbus, TM: tectorial membrane, G: gap between CO and TM,
CO: organ of Corti, IS: inner sulcus, OL: osseous spiral lamina,
BM: basilar membrane, ST: scala tympani, HT: helicotrema, OW: Oval Window,
RW: Round Window, $x_{St}$: displacement of the Oval Window (of the stapes).

For all measurements reported, the displacement of the Oval Window
(OW) was impressed. $L_{St}$ is the level of this displacement re lO'^mm^^^.
Considering the volume displacement of the Oval Window and Fig. 6-43 by
v. BÉKÉSY (1960), $L_{St}$ = 0 dB at f = 70 Hz in the model is equivalent to
a SPL of 125 dB at 2100 Hz in nature.

3. Traveling waves on the membranes: When the Oval Window is displaced
sinusoidally, the displacement-traveling wave (in y-direction) on the
basilar membrane (and the organ of Corti) corresponds to the observations
by v. BÉKÉSY (1960) even if the tectorial membrane is incorporated in the
model. On the tectorial membrane a traveling wave can be observed too,
having similar phase and slightly smaller amplitude than the wave on the
basilar membrane with organ of Corti. The place of maximum vibration
depends on frequency in accordance with the well known frequency-place
principle.








Fig. 2: Top: Absolute value of the displacement of the outer edges of
the tectorial membrane, $y_T$, and the organ of Corti, $y_O$, as a function
of the phase angle $\varphi$ within the period, $L_{St}$ = +6 dB, -6 dB.
Bottom: Width h of the gap. x = 525 mm, f = 60 Hz.



4. Width of the gap between organ of Corti and tectorial membrane: The
upper part of Fig. 2 shows the motion of the organ of Corti, $y_O$, and the
tectorial membrane, $y_T$, as a function of the phase angle $\varphi$ within
one period ($\Delta\varphi > 0$: phase lag to stapes increases). The lower
part of Fig. 2 shows the width h of the gap.

At $L_{St}$ = -6 dB the width h is nearly constant during the period and
approximately equal to the width when no signal is applied. Raising the
level to $L_{St}$ = +6 dB causes a remarkable increase in the average width
of the gap. At the same time the width is modulated with the period of the
driving signal. The upper part of Fig. 2 reveals that a shifting of the
tectorial membrane towards scala vestibuli has primarily caused the
increase in the average width of the gap. There is a large area in the
model, extending from the beginning at the Oval Window to at least the
place of maximum vibration, where the gap shows the same tendency to widen
with increasing level $L_{St}$.



5. Time-dependent variation of the average width of the gap: When rapidly
switching the driving signal at high levels (as applied in section 4), the
average width of the gap between the organ of Corti and the tectorial
membrane followed with a certain delay.

The inset in Fig. 3 (bottom, right) shows the temporal envelope
$\hat{x}_{St}$ of the amplitude modulated displacement of the stapes. The
durations $T_L$ and $T_P$ could be varied independently. $h_{m2}$,
represented by thick lines and solid symbols, is the value of the average
width at the moment "2" (instant indicated by an arrow in the right inset,
average width indicated by left inset); $h_{m1}$ is the value of the
average width at the moment "1" (thin lines and open symbols in top
figure).




Fig. 3: Top: Average width $h_{m1}$, $h_{m2}$ of the gap as a function of
the durations $T_L - t_{St}$, $T_P + t_{St}$. Bottom - right: temporal
envelope $\hat{x}_{St}$ of the amplitude modulated stapes displacement
(carrier frequency f omitted in the figure), $L_{St}$ = +6 dB for the
duration $T_L - 2t_{St}$, $L_{St}$ = -6 dB for the duration $T_P$.
Bottom - left: average width (schematic), x = 360 mm, f = 70 Hz.






In a first experiment the opening of the gap was investigated (triangles,
solid lines in Fig. 3). The time $T_P$ > 5.5 s is long enough for the gap to
reach its steady state value $h_{m1}$ of about 0.35 mm at the moment "1". If
the duration $T_L$ is decreased, the steady state value $h_{m2}$ (at the
moment "2", of about 0.8 mm) is no longer attained as soon as
$T_L - t_{St} \le 0.5$ s.

The second experiment was directed to study the closing of the gap
(circles, dashed lines in Fig. 3). The duration $T_L$ = 1.45 s is long
enough for the gap to achieve its steady state value $h_{m2}$ of about
0.8 mm at the moment "2". If $T_P$ is decreased, starting at about 20 s, the
steady state value at the instant "1" is no longer attained as soon as
$T_P + t_{St} \le 2.0$ s. The closing was studied at several places within
the model at f = 60 Hz, 70 Hz, 90 Hz, and the duration $T_P$ necessary to
attain the steady state value $h_{m1}$ at the moment "1" when $L_{St}$ was
reduced was always found to be about 1 s (corresponding to about 30 ms in
the natural ear). That is distinctly longer than the transient period of
the basilar membrane vibration!
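
The conversions between model and natural ear quoted in the text (70 Hz in the model for 2100 Hz in nature, about 1 s for about 30 ms) follow from the factor of 30 stated in section 2. The sketch below is only an illustration of that bookkeeping; the constant and function names are hypothetical, and it assumes that times scale inversely with the frequency reduction.

```python
# Factor by which the model is enlarged and its frequency range reduced
# (section 2 of the paper).
SCALE = 30.0

def model_to_natural_frequency(f_model_hz: float) -> float:
    """Frequency in the natural ear corresponding to a model frequency."""
    return f_model_hz * SCALE

def model_to_natural_time(t_model_s: float) -> float:
    """Duration in the natural ear corresponding to a model duration,
    assuming time scales inversely with frequency."""
    return t_model_s / SCALE

print(model_to_natural_frequency(70.0))      # model tone of 70 Hz
print(model_to_natural_time(1.0) * 1000.0)   # 1 s in the model, in ms
```

With this assumption a closing time of about 1 s in the model maps to roughly 33 ms, consistent with the "about 30 ms" quoted above.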



6. Increase in selectivity: Whereas the widening of the gap at high levels
occurred in an extended region of the model, there was a relatively small
area where the gap became narrower or almost completely closed.






The inset on top of Fig. 4 shows schematically what happened in the
model. As soon as $L_{St}$ was raised above a certain critical value (about
0 dB to 10 dB) the oscillating tectorial membrane suddenly dropped on the
oscillating organ of Corti, thus closing the gap almost completely. The
extension of this area for a given frequency f (ordinate) at a certain
place x (abscissa) is represented in Fig. 4 by the hatched area between
the dashed lines. With the gap already closed, the level $L_{St}$ could be
lowered slightly. In this case, the extension of the closed gap was
reduced to the small area between the solid lines before opening again.

In comparison with the length in x-direction where the envelope of the
displacement-traveling wave exceeds $1/\sqrt{2}$ times its maximum value
(shown for the basilar membrane at 60 Hz in Fig. 4), the extension of the
closed gap is remarkably smaller. The area where the gap is closed shifts
towards the Oval Window if the frequency is raised, thus showing good
agreement with the frequency-place principle observed for the membrane
displacement.

The closing of the gap is considered as an increase in selectivity,
originating within the hydromechanical system of the cochlea model.

7. Fluid motion inside the inner sulcus: The motion of the fluid was
investigated by suspended aluminium dust particles. Particle transport
through the gap could be observed but was difficult to measure. Particle
movement inside the inner sulcus was investigated by determining the
velocity $v_T$ of the particle transport. Within the range reported in
Fig. 5 the transport was directed towards the helicotrema (cf. inset on
top of Fig. 5). The velocity $v_T$ in x-direction, measured in the centre of
the inner sulcus, reaches its maximum value at a place where the envelope
of the traveling wave already decreases towards the helicotrema.




Fig. 4: Increase in selectivity by closing the gap between tectorial
membrane and the organ of Corti. (Experiment carried out at a version of
the model slightly different from that used for the measurements reported
in Figs. 2, 3, 5.)









According to the experiments, the particle transport was in the same
direction within the entire cross section (in the y-z-plane, cf. Fig. 1)
of the inner sulcus. Therefore, $v_T$ can be considered to be a value
proportional to the average velocity of the volume flow (in x-direction)
through the inner sulcus. The variation of this volume flow as a function
of x, deduced from the dependence of $v_T$ and of the cross section of the
inner sulcus on x, leads to the conclusion that there is flow of fluid out
of the gap concentrating around x = 600 mm, where the descending slope of
$v_T$ is steepest, cf. ZWICKER's (1974 a) observations. At places
x < 350 mm flow of fluid is directed into the gap, cf. STEELE's (1973)
calculations.



Fig. 5: Velocity $v_T$ of the particle transport towards the helicotrema
inside the inner sulcus (solid circles and lines) at three different
displacement levels $L_{St}$. Dashed lines: Envelope of the
displacement-traveling wave on the basilar membrane shifted to equal
maximum value as the associated function $v_T$.



8. Summary and conclusions: The displacement-traveling wave on the
basilar membrane (and the organ of Corti) and the fluid motion within
scala vestibuli are in sufficient agreement with the findings of
v. BÉKÉSY (1960) and TONNDORF (1970). Thus, the additional incorporation of
the tectorial membrane in the model has not fundamentally changed the
principles. The model therefore is qualified for investigating the
influence of the tectorial membrane, on which a displacement-traveling
wave develops similar to that on the basilar membrane.



At high levels of the driving signal the average width of the gap
between tectorial membrane and the organ of Corti depends on the level of
the oscillating displacement of the stapes. Changing the width of the gap
is accomplished primarily by shifting the tectorial membrane, thus
indicating that directed forces exceeding the elastic restoring forces of
the tectorial membrane have developed. As the compliance of this membrane
in the model was smaller than it ought to be according to its natural
value, the effects observed in the model only at high levels can originate
in the natural ear at SPLs within the normal hearing range.

Changing the average width of the gap between the organ of Corti and
the tectorial membrane may serve in the natural cochlea to alter (by
nonlinearities) those dimensions of the sensory organ which are important
in the processes leading to the displacement of the sensory hairs. If flow
of fluid in the gap is assumed to cause this displacement, the model
experiments lead to the following conclusions: Changing the average width
of the gap can result in an adjustment of the sensitivity of the sensory
organ. The directed flow through the gap may cause a directed displacement
of the sensory hairs, increasing or decreasing the sensitivity of each
cell depending on its place relative to the place of maximum vibration of
the basilar membrane, the final excitation of the cell being (phase-locked)
triggered by the superimposed oscillating flow. Therefore the
hydromechanical system of the cochlea is supposed to perform an increase
in selectivity compared to the basilar membrane motion. In addition, the
delay of the average width of the gap when following a change of the
amplitude of the driving signal may be related to the critical duration of
about 20 ms found in psychoacoustic experiments, cf. ZWICKER (1973).



ACKNOWLEDGEMENT: This research was carried out within the
Sonderforschungsbereich Kybernetik München, supported by the Deutsche
Forschungsgemeinschaft.



References 



von BÉKÉSY, G. (1960). "Experiments in Hearing". (McGraw-Hill Book Company,
New York).

FLOCK, A. (1971). "Sensory transduction in hair cells", in "Handbook of
sensory physiology". Autrum, H., Jung, R., Loewenstein, W.R., McKay, D.M.,
Teuber, H.L., Eds. (Springer Verlag Berlin) Vol. I., Chapter 14.

HELLE, R. (1974). "Selektivitätssteigerung in einem hydromechanischen
Innenohrmodell mit Basilar- und Deckmembran". Acustica 31 (in press).

JOHNSTONE, B.M. and BOYLE, A.J.F. (1967). "Basilar membrane vibration
examined with the Mössbauer technique". Science 158, 389 - 390.





KOHLLÖFFEL, L.U.E. (1972). "A study of basilar membrane vibrations I.
Fuzzyness-Detection: A new method for the analysis of microvibrations with
laser light". Acustica 27, 49 - 65.

NEUBERT, K. (1949). "Die Basilarmembran des Menschen und ihr
Verankerungssystem". Z. Anat. 114, 540 - 588.

STEELE, C.R. (1973). "A possibility for sub-tectorial membrane fluid
motion", in "Basic Mechanisms in Hearing", MØLLER, A.R., Ed. (Academic
Press, New York) Chapter I, 69 - 90.

TONNDORF, J. (1970). "Cochlear mechanics and hydro-dynamics", in
"Foundations of modern auditory theory", TOBIAS, J.V., Ed., (Academic
Press, New York) Chapter 6, 205 - 254.

WILSON, J.P. and JOHNSTONE, J.R. (1972). "Capacitive probe measures of
basilar membrane vibration". Symposium on Hearing Theory. (IPO Eindhoven,
Holland).

ZWICKER, E. (1972). "Investigation of the inner ear of the domestic pig and
the squirrel monkey with special regard to the hydromechanics of the
cochlear duct". Symposium on Hearing Theory. (IPO Eindhoven, Holland).

ZWICKER, E. (1973). "Temporal effects in psychoacoustical excitation", in
"Basic Mechanisms in Hearing", MØLLER, A.R., Ed., (Academic Press, New York)
Chapter 6, 809 - 827.

ZWICKER, E. (1974a). "Ein hydromechanisches Ausschnittmodell des Innenohres
zur Erforschung des adäquaten Reizes der Sinneszellen". Acustica 31, (in
press).

ZWICKER, E. (1974b). "Über die Viskosität der Lymphe im Innenohr des
Hausschweines", Acta Oto-Laryng. (Stockh.), (in press).



ADDITIONAL REMARKS 

LEGOUIX: It is interesting to note that the gap between the tectorial 

membrane and the organ of Corti, as Mr. Helle described, shows some 
analogies with the classical properties of summating potential. Further- 
more, it occurs at a particular locus according to frequency, and there 
is a delay before it reaches a steady state. It is assumed that it 
increases the selectivity of the cochlear filter. The same role has 
been hypothesized also for SP. Some further investigation into this 
analogy seems to be worthwhile. 

HELLE: I am glad to see that the experiments concerning the change of
the average width of the gap between the organ of Corti and the tectorial
membrane in the model can be associated with physiological facts observed
in the summating potential.

TONNDORF: (1) A fluid viscosity of only 30 x that of water is, in my 
opinion, too small for your size model. I think this is important with 
regard to fluid velocities and also if, as you seem to suggest, the 
sensory hairs are not attached to the tectorial membrane. (2) Your Fig. 2 
(lower part) shows almost a phase reversal with a change in SPL of 12 dB. 
To me this suggests that you were working close to the borderline between 
linearity and nonlinearity. 





HELLE: (1) According to Tonndorf (1970), p. 208, equation (6-3), the product

$$\frac{\rho f d^{2}}{\mu} = c$$

($\rho$ = density, f = frequency, d = linear dimension, $\mu$ = viscosity,
and c = constant) has to be kept constant in experimental models.

The density $\rho$ in the model has the same value as in nature. If the
linear dimension is enlarged by a factor of 30 and frequency reduced by the
same factor (cf. Chapter 2 of the paper), it follows that viscosity $\mu$
has to be enlarged by the same factor as the linear dimension, the
viscosity then being 30 times that of water. (2) The change itself of the
average width of the gap between the organ of Corti and the tectorial
membrane is a highly nonlinear phenomenon, occurring with sinusoidal
displacement of the stapes (cf. Chapter 8 of the paper). Still there is a
region extending from the beginning of the membrane (near x = 0 mm) to a
place just before the traveling wave reaches its maximum value, where the
gap opens and where the oscillating displacement of both organ of Corti
and tectorial membrane behaves linearly as a function of the displacement
level of the stapes after the gap has reached its steady state value
(cf. Fig. 3). Nevertheless, at the place x where the results of Fig. 2
have been taken, the phase velocity of the traveling wave just starts to
depend on the level of the driving signal. That can be related to
experiments by v. Bekesy (1960) and Tonndorf (1970). As their models did
not contain a tectorial membrane, the phase shift derived from the
modulation of the width of the gap (Fig. 2 lower part) cannot be compared
directly with results of their experiments.
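
The similarity argument in point (1) can be checked numerically. The sketch below is an illustration only: it evaluates the product of density, frequency and squared linear dimension divided by viscosity for a natural case and for a model enlarged 30 times in length and viscosity and reduced 30 times in frequency. The numerical values for the natural case are assumed round numbers chosen for the example, not data from the paper, and the function name is hypothetical.

```python
import math

def similarity_product(density, frequency, length, viscosity):
    """Product rho * f * d**2 / mu that is to be kept constant."""
    return density * frequency * length ** 2 / viscosity

# Assumed round numbers for the natural case (SI units), for illustration.
rho = 1000.0        # density, kg/m^3 (same in model and nature)
f_nat = 2100.0      # frequency, Hz
d_nat = 1.0e-3      # some linear dimension, m
mu_nat = 1.0e-3     # viscosity of water, Pa*s

SCALE = 30.0
natural = similarity_product(rho, f_nat, d_nat, mu_nat)
model = similarity_product(rho, f_nat / SCALE, d_nat * SCALE, mu_nat * SCALE)

print(math.isclose(natural, model, rel_tol=1e-9))
```

Because the length enters squared and the frequency only linearly, enlarging the viscosity by the length factor alone restores the product, which is the statement made in the reply.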

WILSON: In your model you show that a narrow section between the tectorial
and basilar membranes closed up and you suggest that a high fluid streaming 
velocity in this region will stimulate the hair cells. Firstly, would not 
fluid choose an easier path and pass either side of this region, resulting 
in less velocity stimulation? Secondly, would it not be more reasonable to 
suppose that the function of the closure might be to bring the tectorial 
membrane into contact with the stereocilia of the inner hair cells resulting 
in conventional shearing forces? The restricted region of contact would, 
of course, still give the advantages you mention in your model of a more 
selective response than that related to the traveling wave envelope. 

HELLE: The main point of Fig. 4 is that in a relatively restricted area 
along the cochlear partition that dimension of the sensory organ (the aver- 
age width of the gap between the organ of Corti and the tectorial membrane) 
changed which is the most important for the creation of the adequate stimu- 
lus of the hair cells. This change of the average width of the gap can be 
effective for conventional shearing forces as Dr. Wilson pointed out. The 
effect of fluid motion within the gap on deflecting the sensory hairs de- 
pends on the actual profile of the fluid velocity within the gap. If 
volume displacement is thought to be impressed within the inner sulcus the 
effect described in Fig. 4 may be effective for stimulation by fluid motion 
as well. 







A MODEL FOR MECHANICAL TO NEURAL TRANSDUCTION IN THE 
AUDITORY RECEPTOR* 

M. R. SCHROEDER

Drittes Physikalisches Institut, Universität Göttingen, FRG

J. L. HALL

Bell Laboratories, Murray Hill, New Jersey 07974, U.S.A.

I. INTRODUCTION

We present here a model for the transduction of
mechanical motion of the basilar membrane into action
potentials ("spikes") in the auditory nerve. The model,
based on the generation and depletion of electrochemical
quanta in a hypothetical hair cell, is more compatible
with neurophysiological evidence than threshold models of
mechanical-to-neural transduction. The amplitude
normalization observed in auditory-nerve data is an inherent
property of the model.

II. THE MODEL: DEFINITION AND PROPERTIES 

The model is defined by the following three rules:

1. "Quanta" of an (electrochemical) agent are
generated in the hair cell at a fixed average rate,
r quanta/sec.

2. Quanta disappear and cause an attached
(afferent) nerve fiber to fire with a probability per unit
time proportional to their number, n(t), and a
"permeability" function, p(t), related to the input stimulus.



* Preliminary reports on this material were presented
at the 85th meeting of the Acoustical Society of
America, April 1973 (Schroeder & Hall, 1973; Hall &
Schroeder, 1973).







3. In addition, quanta disappear independently
of stimulation and without causing nerve firings with a
probability per unit time equal to $g \cdot n(t)$, where g is a
constant with dimension sec$^{-1}$.

For the permeability function we assume the
following "soft half-wave rectifier" law,

$$p(t) = p_0\left\{\tfrac{1}{2}\,s(t) + \left[\tfrac{1}{4}\,s^{2}(t) + 1\right]^{1/2}\right\}.$$

Here, $p_0$ is a constant with dimension sec$^{-1}$ related to the
spontaneous firing rate of the nerve. The function s(t) corresponds to the
mechanical stimulation signal (related essentially linearly to the
motion of the basilar membrane at the place of the hair
cell). s(t) is normalized such that $s^{2}(t) = 1$ corresponds
to a sound level of approximately 30 dB re 0.0002 dyn/cm$^{2}$.
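
The limiting behavior of the "soft half-wave rectifier" can be checked numerically. The sketch below simply evaluates the law given above; the function name and the choice $p_0$ = 1 are assumptions made for the example.

```python
import math

def permeability(s: float, p0: float = 1.0) -> float:
    """Soft half-wave rectifier: p0 * (s/2 + sqrt(s**2/4 + 1))."""
    return p0 * (0.5 * s + math.sqrt(0.25 * s * s + 1.0))

# No stimulus: p equals p0.
print(permeability(0.0))
# Small signals: p - p0 is close to p0 * s / 2, i.e. linear in s.
print((permeability(0.01) - 1.0) / 0.01)
# Large positive signals pass almost unchanged, large negative ones
# are suppressed (half-wave rectification).
print(permeability(100.0), permeability(-100.0))
```

A consequence of the algebraic form, visible in the last line, is that p(s) times p(-s) equals $p_0^2$ for any s, so the suppression for negative excursions mirrors the growth for positive ones.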

The firing probability per unit time f(t) is
given by (cf. Rule 2)

$$f(t) = n(t)\,p(t). \qquad (1)$$

For sufficiently high fundamental frequency of a
stimulating periodic signal, the number of quanta will
remain relatively constant during the fundamental period.
Thus, the firing probability per unit time, Eq. 1, can be
approximated by

$$f(t) = n\,p(t). \qquad (1a)$$

Hence, the firing probability is approximately proportional
to p(t). For small signal amplitudes, $p(t) - p_0$ itself is
proportional to the signal. For large signal amplitudes,
p(t) is proportional to the half-wave rectified signal.

The firing probability can be shown to be asymptotically
independent of signal amplitude: $f(t) \to r\,p(t)/\bar{p}$ (the bar
denoting a time average). This "normalization" of the firing probability
is one of the outstanding characteristics observed in period histograms
and follows here directly from the Rules that specify the
model.
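
One way to see where the asymptotic form comes from is sketched below. This is a reconstruction under stated assumptions, not the authors' own derivation: it takes the average number of quanta to obey a rate equation implied by Rules 1 to 3, and it assumes that n(t) is nearly constant over a stimulus period, as in Eq. 1a. The symbols $\bar{n}$ and $\bar{p}$ denote period averages.

```latex
% Average balance of quanta implied by Rules 1-3 (assumed form):
\frac{dn}{dt} = r - \bigl[g + p(t)\bigr]\,n(t)
% Average over one period in the steady state, with n(t) \approx \bar{n}:
0 = r - \bigl(g + \bar{p}\bigr)\,\bar{n}
\quad\Longrightarrow\quad
\bar{n} = \frac{r}{g + \bar{p}}
% Insert into Eq. 1a:
f(t) \approx \frac{r\,p(t)}{g + \bar{p}}
\;\longrightarrow\; \frac{r\,p(t)}{\bar{p}}
\qquad \text{for } \bar{p} \gg g .
```

Since p(t) and its average grow together with signal amplitude, their ratio keeps the waveform of p(t) while its overall size stops growing, which is the normalization described in the text.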

Another neurophysiologically-observed property that
is correctly predicted by Eq. 1a is the (for positive
signal values) essentially true waveform reproduction of
the firing probability, including its phase stability with
signal amplitude (Rose et al., 1971, Fig. 11). This property
of nerve activity is difficult to explain by threshold models
(Weiss, 1966).

RC-Circuit Representation of Average Behavior

It is interesting to note that the average behavior of
the model can be represented by the simple RC circuit shown
in Fig. 1: a current generator with a fixed shunt conductance g,
a shunt capacitance C, and a variable conductance p(t).

The source current r corresponds to the rate of
generation of quanta, and the current f(t) through
the variable conductance represents the firing
probability of the nerve. The charge on the capacitor
C corresponds to the average number of quanta in the
hair cell.

Many properties of the model, including those
already discussed, can be visualized with the aid of the
electrical circuit of Fig. 1. As an example, consider the
average firing probabilities for tone bursts (corresponding to
the envelopes of PST histograms). It is apparent from the
charging and discharging time constants of the capacitor that the
model shows the familiar initial overshoot, the suppression of the
firing rate below the steady-state spontaneous rate after
the cessation of the stimulating signal, and the slow
recovery to the spontaneous firing rate following the burst.
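
The tone-burst behavior described above can be reproduced with a few lines of numerical integration. The sketch below is an illustration under stated assumptions, not the authors' simulation: it integrates the average balance dn/dt = r - (g + p) n implied by the three Rules, replaces p(t) by its period average (one value at rest, a larger one during the burst), and uses arbitrary parameter values chosen only to give a resting rate of 50/sec. All names are hypothetical.

```python
def simulate_tone_burst(r=100.0, g=10.0, p_rest=10.0, p_burst=100.0,
                        t_on=0.2, t_off=0.4, t_end=0.8, dt=1e-4):
    """Euler integration of dn/dt = r - (g + p) * n with a stepped p."""
    k_on, k_off, k_end = (int(round(t / dt)) for t in (t_on, t_off, t_end))
    n = r / (g + p_rest)            # start at the resting steady state
    rates = []
    for k in range(k_end):
        p = p_burst if k_on <= k < k_off else p_rest
        rates.append(n * p)         # firing rate f = n * p, cf. Eq. 1
        n += dt * (r - (g + p) * n)
    return rates, k_on, k_off

rates, k_on, k_off = simulate_tone_burst()
spontaneous = rates[0]              # rate before the burst
onset_peak = rates[k_on]            # first sample of the burst
burst_steady = rates[k_off - 1]     # last sample of the burst
offset_dip = rates[k_off]           # first sample after the burst
print(spontaneous, onset_peak, round(burst_steady, 1), round(offset_dip, 1))
```

With these assumed values the rate jumps well above its eventual burst level at onset, falls below the spontaneous rate at offset, and then recovers with the slower resting time constant 1/(g + p_rest).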



Fig. 1 - Equivalent RC-circuit approximation for average model
behavior. (Current corresponds to firing probability; charge corresponds
to number of quanta.)







Distribution of Intervals 

An important neurophysiological measurement is 
that of "interval histograms" reflecting the probabilities 
of occurrence of different length intervals between 
successive "spikes". Mathematical analysis by Logan and 
Shepp (1973) of a discrete-time version of our model for 
no input signal (which then forms a Markov process) shows 
that intervals are geometrically distributed. This result 
is in agreement with available neural spike data, except 
for refractory effects (Molnar and Pfeiffer, 1968). 
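
To illustrate what a geometric interval distribution looks like, here is a small sketch. It is not the Logan and Shepp analysis and not the hair cell model itself: it assumes the simplest stand-in, a constant firing probability `q` per time step (a hypothetical parameter), for which the interval of k steps has probability q(1 - q)^(k-1) and mean 1/q.

```python
import random


def geometric_pmf(k, q):
    """Probability that the interval between events is exactly k steps."""
    return q * (1.0 - q) ** (k - 1)


def simulate_intervals(q, n_steps, seed=0):
    """Draw one event per step with probability q; return inter-event intervals."""
    rng = random.Random(seed)
    intervals, last = [], None
    for step in range(n_steps):
        if rng.random() < q:
            if last is not None:
                intervals.append(step - last)
            last = step
    return intervals


intervals = simulate_intervals(q=0.05, n_steps=200_000)
print("mean interval (steps):", sum(intervals) / len(intervals))
print("expected 1/q         :", 1 / 0.05)
```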

III. DIGITAL SIMULATION OF THE MODEL WITH RECOVERY FUNCTION 

In this section we present results of computer 
simulations in which an attempt is made to model 
refractoriness as well as the properties of the hair cell model. 
The refractory effects seen in nerve spike data can be included 
by making the firing probability f(t) depend in part on 
time elapsed since the preceding neural event. Instead of 
Eq. 1, we have 

$f(t) = n(t) \cdot p(t) \cdot \rho(t - t_0)$ ,    (1b) 

where $\rho(t - t_0)$ is a recovery function and $t_0$ is the time at 
which the preceding neural event occurred. For the 
simulations reported here, $\rho(t - t_0)$ was such that there was a dead 
time of 1 msec followed by an exponential recovery with a 
time constant of 1 msec. 
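
One way to write such a recovery function is sketched below. The text gives only the dead time and the time constant; the specific shape used here (zero during the dead time, then 1 - exp(-(elapsed - dead time)/tau)) is an assumption, as are the function and parameter names.

```python
import math

DEAD_TIME = 1e-3   # seconds, from the text
TAU = 1e-3         # seconds, from the text


def recovery(elapsed, dead_time=DEAD_TIME, tau=TAU):
    """Recovery factor as a function of time since the preceding event.

    Assumed shape: 0 during the dead time, then an exponential
    approach to 1 with time constant tau.
    """
    if elapsed < dead_time:
        return 0.0
    return 1.0 - math.exp(-(elapsed - dead_time) / tau)


def firing_probability(n, p, elapsed):
    """Eq. 1b: number of quanta times signal-dependent factor times recovery."""
    return n * p * recovery(elapsed)


for ms in (0.5, 1.0, 2.0, 5.0):
    print(f"{ms} msec after a spike: recovery = {recovery(ms * 1e-3):.3f}")
```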

Continuous Sinusoid Input 

Figure 2 shows the average firing rate as a function 
of stimulus intensity for sine-wave inputs of 125 Hz, 
1 kHz, and 4 kHz. This figure can be compared to 
physiological results (Kiang et al., 1965, Fig. 6.10; Rose et al., 
1971, Fig. 9). Results at 1 and 4 kHz are similar: the 
average firing rate increases from about 50/sec to about 
150/sec as stimulus intensity is increased. At low 
frequencies, for example at 125 Hz shown here, the growth 
function is less steep and saturates at a lower firing rate. 








Fig. 2 - Average firing rate in response to continuous 
tones of 125 Hz, 1 kHz, and 4 kHz. 

Period histograms in response to a continuous 
1-kHz tone are shown in Fig. 3. The time origin is at the 
excitatory-going zero crossing of the stimulus, the bin 
width is 50 µsec, and 50 sec of data are represented. For 
stimulus intensities up to 30 dB above reference level the 
response is approximately sinusoidal, while at higher 
intensities the response becomes a half-wave rectified 
version of the stimulus. The stimulus waveform is preserved 
during the excitatory half-cycle except at very high 
intensities, where the histogram is skewed to the left. A similar 
effect has been observed in physiological data (Gray, 1967). 
This high-intensity distortion occurs in the model because 
more quanta are available at the beginning of the excitatory 
half-cycle than at the end, and it becomes more pronounced 
for low-frequency stimuli. 
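
For readers unfamiliar with period histograms, the folding step can be sketched as follows. This is an illustration only, assuming spike times in seconds with t = 0 at an excitatory-going zero crossing; the function name and the example spike times are hypothetical. With a 1-msec period and 50-µsec bins it yields 20 bins.

```python
def period_histogram(spike_times, period=1e-3, bin_width=50e-6):
    """Fold spike times modulo the stimulus period and count per phase bin."""
    n_bins = int(round(period / bin_width))
    counts = [0] * n_bins
    for t in spike_times:
        phase = (t % period) / period               # position within the cycle, 0..1
        index = min(int(phase * n_bins), n_bins - 1)  # guard against rounding at the edge
        counts[index] += 1
    return counts


# Hypothetical spike times (seconds); three of them fall at the same phase.
example = [0.0, 0.00026, 0.00126, 0.00226, 0.00099]
print(period_histogram(example))
```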








Fig. 3 - Firing probability for one period of a 
continuous 1-kHz tone. 



Complex Tone Input 

Figure 4 shows period histograms in response to 
a complex input with components at 1 and 2 kHz. The phase 
and intensity relations are such that there are two 
excitatory stimulus peaks, the first three times as big as the 
second. If the only effect were the compression shown in 
Fig. 2, then at high intensities the two response peaks in 
Fig. 4 would differ by only a few percent. In fact, the 
waveform of the stimulus is preserved in the response, and 
the first response peak is always substantially larger 
than the second. Similar results have been observed for 
auditory-nerve fibers (Rose et al., 1971). 

Fig. 4 - Firing probability for one cycle of a 
complex input with components at 1 and 2 kHz. Stimulus: 
$A \sin(2\pi f_1 t) - 2A \cos(2\pi f_2 t)$, with $f_1$ = 1 kHz and 
$f_2$ = 2 kHz. Parameter: intensity A (dB re reference level). 
Abscissa: time (msec). 
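
The statement that the first stimulus peak is three times the second can be verified directly from the stimulus printed with Fig. 4, A sin(2π f1 t) - 2A cos(2π f2 t) with f1 = 1 kHz and f2 = 2 kHz. The sketch below assumes that positive values are the excitatory direction and uses A = 1; the function names are illustrative.

```python
import math


def stimulus(t_ms, amplitude=1.0):
    """A*sin(2*pi*f1*t) - 2A*cos(2*pi*f2*t), f1 = 1 kHz, f2 = 2 kHz, t in msec."""
    return amplitude * (math.sin(2 * math.pi * t_ms)
                        - 2 * math.cos(4 * math.pi * t_ms))


def local_maxima(n=10_000):
    """Return (time in msec, value) of each local maximum within one 1-msec cycle."""
    values = [stimulus(k / n) for k in range(n)]
    peaks = []
    for k in range(n):
        # values[k - 1] wraps around for k = 0, which is correct for a periodic signal
        if values[k] > values[k - 1] and values[k] >= values[(k + 1) % n]:
            peaks.append((k / n, values[k]))
    return peaks


for t_ms, value in local_maxima():
    print(f"peak at {t_ms:.2f} msec, height {value:.3f}")
```

Analytically, the derivative vanishes where cos(2πt) = 0, giving peaks of height 3A at 0.25 msec and A at 0.75 msec.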

IV. SUMMARY 

We have presented results from both mathematical 
analysis and computer simulation of a model for the trans- 
duction of motion of the basilar membrane to spike genera- 
tion by single fibers of the auditory nerve. The model can 
be considered in a physiological context as representing the 
generation and depletion of (electrochemical) quanta in a 
hypothetical hair cell. 

The model is economical, in the sense that it is 
specified by only three parameters. These parameters are 
the average rate at which quanta are generated, the 
probability of a quantum disappearing without causing an event, 
and the zero-signal probability of a quantum disappearing 
and causing an event. These three parameters determine 
spontaneous and maximum firing rates and the time constant 
of recovery following intense stimulation. There is a 
simple relationship between amplitude of input signal and 
probability of a quantum disappearing and causing an event. 
In some of the simulations reported here a refractory 
period following an event was introduced. 

The model reproduces the normalization of input 
signal observed in nerve-cell data at moderate to high 
intensities. As the level of the input signal is increased, 
the average firing rate first increases and then saturates, 
or levels off, at high intensities. Period histograms in 
response to pure and complex tones at high intensities 
retain the waveform of the half-wave rectified input, in 
keeping with neural data, and it is not necessary to 
introduce ad hoc prenormalization schemes such as have been 
required with threshold-crossing models, although there may, 
in fact, be considerable "normalization" in the 
basilar-membrane mechanics (Rhode, 1971). 

REFERENCES 

Gray, P. R. (1967) "Conditional Probability Analysis of the 
Spike Activity of Single Neurons," Biophys. J. 7, 759-777. 

Hall, J. L. and Schroeder, M. R. (1973) "Computer Simulation 
of a New Hair Cell Model," J. Acoust. Soc. Amer. 54, 283 (A). 

Kiang, N. Y.-S., Watanabe, T., Thomas, E. C., and Clark, L. F. 
(1965) Discharge Patterns of Single Fibers in the Cat's 
Auditory Nerve (M.I.T. Research Monograph No. 35, 
Technology Press, Cambridge). 

Logan, B. F. and Shepp, L. A. (1973) "A Birth and Death Model 
of Neuron Firing," to be published in J. Appl. Prob. 

Molnar, C. E. and Pfeiffer, R. R. (1968) "Interpretation of 
Spontaneous Spike Discharge Patterns of Neurons in the 
Cochlear Nucleus," Proc. IEEE 56, 993-1004. 

Rhode, W. S. (1971) "Observations of the Vibration of the 
Basilar Membrane in Squirrel Monkeys Using the Mössbauer 
Technique," J. Acoust. Soc. Amer. 49, 1218-1231. 

Rose, J. E., Hind, J. E., Anderson, D. J., and Brugge, J. F. 
(1971) "Some Effects of Stimulus Intensity on Response of 
Auditory Nerve Fibers in the Squirrel Monkey," J. 
Neurophysiol. 34, 685-699. 

Schroeder, M. R. and Hall, J. L. (1973) "A Model for 
Mechanical-to-Neural Transduction in the Auditory Receptor 
('hair cell')," J. Acoust. Soc. Amer. 54, 283 (A). 

Weiss, T. F. (1966) "A Model of the Peripheral Auditory 
System," Kybernetik 3, 153-175. 







COMMENTS ON: "A Model for mechanical to neural transduction in the 
auditory receptor" (SCHROEDER AND HALL) 

H. DUIFHUIS 

Institute for Perception Research, Eindhoven, The Netherlands 

Schroeder and Hall's model gives an elegant and easily visualizable 
description of adaptation of the auditory receptor cell. However, other 
features of the model, including mathematical analysis of the transfer 
properties of the model, have received considerable attention in the 
literature (Siebert and Gray, 1963; Siebert, 1965, 1968, 1970, 1972; 
Duifhuis, 1971, 1972). In particular Siebert's 1972 paper proposes a rate 
function identical to the one following from the present model (including 
saturation). The models taking account of synchronization use 
nonhomogeneous Poisson processes. Even after discretization of time this 
term appears adequate as long as the bin width is small enough to allow 
no more than one spike per time-bin (before averaging). 

A serious problem with Schroeder and Hall's model is that it does not 
predict sufficient onset saturation and it does predict a decrease of 
adaptation time-constant with increase in level. These predictions do 
not agree with auditory-nerve fiber data (e.g. Smith, 1973). 



REFERENCES 

Duifhuis, H. (1971). "A tentative firing model for the auditory receptor 
cell," (presented at the 7th ICA, Budapest). 

Duifhuis, H. (1972). Perceptual Analysis of Sound, Ph.D. thesis, 
Eindhoven University of Technology. 

Siebert, W.M. and Gray, P.R. (1963). "Random process model for the firing 
pattern of single auditory neurons," MIT - RLE - QPR, 241-245. 

Siebert, W.M. (1965). "Some implications of the stochastic behavior of 
primary auditory neurons," Kybernetik 206-215. 

Siebert, W.M. (1968). "Stimulus transformations in the peripheral 
auditory system," in: Recognizing Patterns, Kohlers, P.A. and Eden, M. 
(eds.), (MIT Press, Cambridge, Mass.). 

Siebert, W.M. (1970). "Frequency discrimination on the auditory system: 
Place or periodicity mechanisms?" Proc. IEEE 723-730. 

Siebert, W.M. (1972). "What limits auditory performance?" Proc. 4th Conf. 
Int. Union of Pure and Applied Bio., Moscow. 

Smith, R.L. (1973). Short-term Adaptation and Incremental Responses of 
Single Auditory-nerve Fibers. Ph.D. thesis, Syracuse University, N.Y. 







A "SECOND FILTER" ESTABLISHED WITHIN THE SCALA MEDIA 

(General Comment) 

E. ZWICKER 

Institut für Elektroakustik, Techn. Universität München, FRG 

In several manuscripts contributed to this symposium (cf. 
DALLOS, EVANS, HELLE, TONNDORF, WILSON) a possible sharpening 
of the basilar membrane's frequency selectivity by an 
additional system is mentioned, which may be called the "second filter". 
Since we have observed fluid motions in the gap between the 
tectorial membrane and the organ of Corti in the unfixated ear 
of the domestic pig when stimulating the stapes by large 
vibration amplitudes (ZWICKER, 1972), we thought about this "second 
filter" as possibly being installed within the scala media. 



Two different kinds of models have been realized in which the 
hydrodynamical effects can be studied which take place within 
the scala media and especially in the neighbourhood of the 
hairs of the sensory cells. One models the total inner ear 
from the stapes up to the helicotrema, including an imitation of 
both the organ of Corti glued on the basilar membrane and the 
tectorial membrane, as described by HELLE (1974 and paper presented 
in this symposium). The other models only the part of the 
inner ear in which the traveling wave along the basilar membrane 
becomes effective (without oval window and helicotrema), as 
described in detail elsewhere (ZWICKER 1974 a, b). However, in this 
second model the traveling wave along the basilar membrane is 
imprinted. To stimulate the discussion on the "second filter", 
some of the results obtained by this technique are summarized. 

The model is a 200:1 imitation of a 5 mm long segment of the 
inner ear, with a length of about 1 m. A cross-section is shown in 
Fig. 1, where the scala tympani (ST) is the upper part with 
open surface. The basilar membrane (BM) with organ of Corti 
(CO), sulcus spiralis (SS), tectorial membrane (TM) and 
Reissner's membrane (RM) represent the scala media (SM). Scala 
vestibuli (SV) is the lower part. The whole model is filled with 
water. The coordinate system refers to the narrow gap between 
CO and TM. TM is porous and opaque. All other parts are 
transparent. The buoyancy of the TM is adjusted in such a way 
that without vibration of the tappets the gap between CO and TM 
is just closed. 

Fig. 1: Cross-section of the model. The dashed parts indicate 
a few of the 23 tappets by which the traveling wave along the 
basilar membrane is imprinted. 

Imprinting a traveling wave with 1 Hz, the main hydrodynamic 
effects can be described as follows: 

a) In the area in which the envelope of the traveling wave (TW) 
increases, a DC-stream in x-direction is observed within the 
sulcus spiralis. Near the maximum of the TW-envelope this stream 
gets an additional component in z-direction. In the range of 
decreasing TW-envelope the stream follows z-direction almost 
completely (same direction as fibrilles of the TM). 



b) As shown in Fig. 2 the width of the gap between CO and TM 
depends on time (expressed by phase angles within one period) 
and space (expressed in cm of the x-coordinate). The opening 
of the gap occurs only in regions near and behind the maximum of 
increasing TW-envelope, while within the whole range of 
increasing TW-envelope the vibration in y-direction of CO and of 
TM are identical, i.e. the gap does not open. 

Fig. 2: Envelope of the traveling wave displacement of the BM 
(dashed-dotted) and instantaneous displacement y of the CO 
(solid) and of the TM (dashed) at 4 different moments within a 
period as function of the place x (abscissa, in cm). 

c) Out of the gap a flow of water in z-direction can be 
observed which is mostly unidirectional and impulsive. This flow 
depends on place and time, too. During the time when the gap 
is closed, the fluid cannot flow through the gap. At the time 
when the gap opens, the fluid seems to flow a little bit into 
the gap, i.e. in the negative z-direction. However, during that 
period of time at which the gap starts to close, a strong 
impulsive flow out of the gap can be observed (see Fig. 3). Since 
only the wavefront of stained fluid can be measured, its 
velocity c_z does not drop to zero immediately after the closing of 
the gap but with some "tail" because of inertia, although the 
flow within the gap is already zero. At other places, the 
impulsive fluid flow occurs at different moments within the 
period, but the relation between opening of the gap and the 
velocity of the gap flow always behaves similarly: the peak velocity 
is reached just before the gap closes. 

Fig. 3: Opening Δy of the gap at the place x = 69 cm as function 
of time t expressed in parts of the period 2π (dashed). 
Velocity c_z of the fluid flow out of the gap in z-direction at 
the same place as function of time. 

d) The maximum of the gap opening depends on the place in such 
a way that it can be described as the derivative of the 
TW-envelope, i.e. it has its maximum at the place where the 
TW-envelope has its steepest slope but becomes zero at the place 
where the TW-envelope has its maximum. The maximum values of 
the velocity c_z depend on place in almost the same form. 

e) The maximum value of the velocity c_z of the fluid particles 
moving out of the gap is about 10 times larger than the 
maximum velocity imprinted at the BM at the same place. This 
indicates not only that an increase of the spatial selectivity 
takes place but also that the energy of the sound wave 
stimulating the ear is transformed and transferred to the place where 
it is needed, namely the gap between the CO and the TM. Here the 
groups of hairs of the sensory cells may be bent by the fluid 
velocity and thereby activated. 

f) It could be observed that the hydrodynamical system 
installed within the scala media follows the vibration of the BM 
without delay and almost without inertia. Although the spatial 
extent of the velocity c_z is narrower, the temporal pattern of 
c_z follows that of the BM. A single period of the imprinted 
traveling wave, even half a period of the right phase, is enough 
to produce in the gap a velocity c_z of almost the same maximal 
amplitude as a vibration of the BM in steady state condition. 
The slow stream of liquid within the sulcus spiralis (SS), on 
the other hand (see a)), shows observable inertia because of 
its much larger mass. 



ACKNOWLEDGEMENT: This work was carried out within the 
Sonderforschungsbereich Kybernetik München, which is supported by the 
Deutsche Forschungsgemeinschaft. 



REFERENCES 

HELLE, R. (1974) "Selektivitätssteigerung in einem hydromechanischen 
Innenohrmodell mit Basilar- und Deckmembran". 
Acustica 31 (in press). 

ZWICKER, E. (1972) "Investigations of the inner ear of the 
domestic pig and the squirrel monkey with special 
regard to the hydromechanics of the cochlear duct". 
Symposium on Hearing Theory (IPO Eindhoven, Holland). 

ZWICKER, E. (1974 a) "Ein hydromechanisches Ausschnittmodell 
des Innenohres zur Erforschung des adäquaten Reizes 
der Sinneszellen". Acustica 31 (in press). 

ZWICKER, E. (1974 b) "Spaltweite und Spaltströmung in einem 
Ausschnittmodell des Innenohres". Acustica 31 (in 
press). 







ADDITIONAL REMARKS 

WILSON: I would like to make a comment similar to the one I made in 
respect of Mr. Helle's model, although I believe the implications for 
your case could be of greater significance. If, as you show, the opening 
and closure of the gap between tectorial and basilar membranes is 
modulated at signal frequencies, we have the interesting possibility that 
this could perform a functional multiplication operation on conventional 
shearing motion between stereocilia and the tectorial membrane. Such a 
process could be analogous to phase-sensitive detection and lead to 
sharp frequency filtering properties. Stated more simply, if there are 
phase differences between the traveling wave and gap opening (as you 
show in your Fig. 2), at some position along the membrane the gap will 
close at the same time that maximal shearing motions occur, giving 
maximal hair cell excitation. At neighboring positions, however, the 
maximal shearing motions will occur when the tectorial membrane is out 
of contact with the stereocilia, and during the times of contact the 
shearing motions will be at their minimum, giving little output. Such 
a model would appear to have many attractive features at the level of 
the inner hair cell. 

ZWICKER: I agree completely. For further features and implications 
I would like to refer to my 1974 a/b papers. 







AN ALTERNATIVE APPROACH TO THE SECOND FILTER 
(General Comment) 

H. DUIFHUIS 

Institute for Perception Research, Eindhoven, The Netherlands 

Introduction 

This comment presents some speculative ideas about a possible 
site of "the second filter" and of a compressing nonlinearity. 
The class of transducers consisting of filter, nonlinearity and 
second filter has implications for a variety of psychophysical 
and physiological phenomena. Two-tone suppression, frequency 
selectivity (sharpening), and combination tones are amongst the 
most obvious of these phenomena. Hence, this comment relates to 
a number of contributions to this symposium. 

The major new elements in the present approach are: (1) a 
suggestion for the possible location of the second filter is 
worked out, (2) it is assumed that tuning frequencies of first 
and second filter are unequal, and (3) second filter and 
nonlinearity are assumed to be coupled. 

A possible sharpening mechanism at the hair cell 

It has been established that the hair cell is directionally 
sensitive. Movement of the cilia towards the remnant base of 
the kinocilium is excitatory (e.g., Lowenstein & Wersall, 1959). 
Such a motion is most likely to occur under conditions of radial 
shear between tectorial membrane and cuticular plate. There are 
indications (Tonndorf) that the vibration mode of the basilar 
membrane excitation rotates from radial through transversal to 
longitudinal, when proceeding in apical direction. These 
observations inspired the following more specific assumptions: 

A) Let a $CF_x$ be assigned to the inner hair cell located at x mm 
from the base. The $CF_x$ is defined as the frequency that 
stimulates the hair cell in its most sensitive direction. 

B) Frequencies off $CF_x$ stimulate the hair cell under an angle $\theta$ 
with its most sensitive direction, and are therefore less 
effective. The angle monotonically approaches $\pi/2$ if the 
difference between stimulating frequency $f$ and $CF_x$ increases. 





C) The tuning frequency of the mechanical excitation pattern at 
x mm from the base is $a \cdot CF_x$, with $a > 1$. 

D) At the hair cell the mechanical excitation undergoes a com- 
pression which is uniform in all directions. 

In addition some quantitative descriptions are needed for a 
further analysis of the consequences of the above assumptions. 
The following will be used: 

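Transfer function of 1st filter: 

$$H_1(f,x) = \left(\frac{f}{a\,CF_x}\right)^{s_1} \qquad (1)$$ 

with $s_1 = s_{11}(x)$ if $f < a\,CF_x$, 
     $s_1 = -s_{12}(x)$ if $f > a\,CF_x$. 

Directional sensitivity angle $\theta$: 

$$\theta(f,x) = \ldots \qquad (2)$$ 

(the right-hand side of Eq. 2 is not legible in the source) 

with $s_2 = s_{21}(x)$ if $f < CF_x$, 
     $s_2 = -s_{22}(x)$ if $f > CF_x$, 
     $s_{22} > s_{21} > \ldots$ 

The sensitivity factor is $\cos\theta$. 

The compressing nonlinearity (input $i(t)$, output $y(t)$): 

$$y(t) = \operatorname{sign}\{i(t)\} \cdot |i(t)|^{\nu}, \qquad 0 < \nu < 1 \qquad (3)$$ 

If we approximate $\cos\theta$ by $\cos\theta \approx (f/CF_x)^{s_2}$, then it is not 
difficult to show that the tuning curve (based on a constant 
time averaged output at the hair cell) has slopes of 
$-6(s_1 + s_2/\nu)$ dB/oct. 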
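
The slope can be made plausible with a short derivation. This is a sketch under stated assumptions, not the author's own working: it assumes that the first filter is the power law $(f/a\,CF_x)^{s_1}$, that the compression of Eq. 3 acts on the filtered amplitude before the projection factor $\cos\theta \approx (f/CF_x)^{s_2}$ is applied, and it introduces $A$ (a hypothetical symbol) for the input amplitude of a single tone.

```latex
\begin{align*}
\text{output} &\propto \Bigl[A\,\bigl(f/a\,CF_x\bigr)^{s_1}\Bigr]^{\nu}\,
                 \bigl(f/CF_x\bigr)^{s_2} = \text{const} \\
\Rightarrow\quad A &\propto f^{-(s_1 + s_2/\nu)} \\
\Rightarrow\quad \Delta L \text{ per octave}
   &= 20\log_{10} 2^{-(s_1 + s_2/\nu)}
    \approx -6\,(s_1 + s_2/\nu)\ \text{dB/oct.}
\end{align*}
```

The factor 6 is $20\log_{10}2 \approx 6.02$, i.e. the level change per octave for each unit of exponent.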

Two-tone interaction 

Consider the stimulus 

$$a_1 \sin\omega_1 t + a_2 \sin\omega_2 t. \qquad (4)$$ 

Let $\omega_1 = 2\pi\,CF_x$; then $\theta_1 = 0$, and $\theta_2$ is given by Eq. 2 with 
$f = f_2$. Therefore, at the hair cell one has the vector sum of 
the two components (Fig. 1). Since $f_1 \neq f_2$, $\psi(t)$, the angle 
between the resultant vector $r(t)$ and the sensitivity axis of 
the hair cell, will be a function of time $t$. The resultant $r(t)$ 
can be decomposed into the following orthogonal components: 

$$r_\parallel(t) = a_1 \sin\omega_1 t + a_2 \cos\theta_2 \sin\omega_2 t$$ 

which equals: 

$$r_\parallel(t) = r(t)\cos\psi(t), \qquad (5a)$$ 

and: 

$$r_\perp(t) = a_2 \sin\theta_2 \sin\omega_2 t = r(t)\sin\psi(t). \qquad (5b)$$ 

Fig. 1. Vector sum of contributions of a 2-tone stimulus. Tone 1 
is at $CF_x$, tone 2 is off $CF_x$. 

The compression will affect $|r(t)|$, but it will not change $\psi(t)$. 
Let the result of the compression be $R(t)$, then the effective 
stimulating waveform is $R_\parallel(t)$: 

$$R_\parallel(t) = R(t)\cos\psi(t) = r(t)^{\nu}\cos\psi(t). \qquad (6)$$ 

This can be written as: 

$$R_\parallel(t) = r_\parallel(t)^{\nu}\,\cos^{1-\nu}\psi(t) \qquad (6a)$$ 

or: 

$$R_\parallel(t) = r_\parallel(t)\cdot r_\perp(t)^{\nu-1}\,\sin^{1-\nu}\psi(t). \qquad (6b)$$ 

(For $i^{\nu}$ one should read: $\operatorname{sign}(i)\cdot|i|^{\nu}$.) 




Two-tone suppression 





From Eqs. 5 and 6b it is evident that if [...] and $a_2 >$ [...], 
then $\sin^{1-\nu}\psi(t) \to 1$ and $R_\parallel(t)$ decreases with increasing $a_2$. 
This will be true as long as $a_2\cos\theta_2$ [...]. [For] very high values 
of $a_2$ the suppressing tone will determine the average value of 
$R_\parallel(t)$, which then follows [...]. It can be shown that the 
boundaries of the inhibition areas (defined as the area where tone 2 
suppresses the average response to tone 1 by, or by more than, a 
certain factor) approximate the membrane selectivity on the one 
side (Eq. 1) and the tuning curve on the other side. Fig. 2 shows 
the results of a computer simulation of this model. The result 
can be verified analytically. 

Fig. 2. Two-tone suppression areas (shaded) and tuning curve 
predicted by the model, as a function of the frequency of tone 2 
(kHz). Parameter values: $s_{11}$ = 4; $s_{12}$ = 16; $s_{21}$ = 6; $s_{22}$ = 20; 
$a$ = 1.4; $\nu$ = 0.8. Suppression conditions: fixed tone 1, 0 dB at 
1 kHz; suppression in shaded areas > 20 %. 
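
The suppression mechanism can be illustrated with a few lines of code. This is a minimal sketch and not the simulation behind Fig. 2: it takes the parallel component as a1 sin(ω1 t) + a2 cos(θ2) sin(ω2 t), the perpendicular component as a2 sin(θ2) sin(ω2 t), and the compressed output as r^ν cos ψ, which equals r_par · r^(ν-1). It further assumes that the "average response" is the mean of the half-wave rectified output, and it uses hypothetical values for θ2 (cos θ2 = 0.01), the frequencies, and the amplitudes; a1 and a2 are amplitudes at the hair cell, after the first filter.

```python
import math


def mean_response(a2, a1=1.0, theta2=math.acos(0.01), nu=0.8,
                  f1=1000.0, f2=1300.0, duration=0.01, fs=1_000_000):
    """Mean half-wave rectified R_par for a two-tone stimulus (assumed measure)."""
    n = int(round(duration * fs))
    total = 0.0
    for k in range(n):
        t = k / fs
        s1 = a1 * math.sin(2 * math.pi * f1 * t)
        s2 = a2 * math.sin(2 * math.pi * f2 * t)
        r_par = s1 + s2 * math.cos(theta2)     # component along the sensitivity axis
        r_perp = s2 * math.sin(theta2)         # component perpendicular to it
        r = math.hypot(r_par, r_perp)          # length of the resultant vector
        if r > 0.0 and r_par > 0.0:
            total += r_par * r ** (nu - 1.0)   # r**nu * cos(psi), excitatory half only
    return total / n


for a2 in (0.0, 1.0, 3.0, 10.0):
    print(f"a2 = {a2:5.1f}: mean response = {mean_response(a2):.3f}")
```

Under these assumptions the response to tone 1 falls as the nearly perpendicular tone 2 grows, because tone 2 drives the compression while contributing almost nothing along the sensitivity axis.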



Pure-tone masking 

For the iso-$L_P$ masking curve (Vogten, Zwicker) the model 
predicts slopes of $6\{s_1(f,x) + s_2(f,x)\}$ dB/oct. Also a linear 
relation between $L_P$ and $L_M$ is predicted (as long as [...] does not 
depend on level). There are two ways to arrive at this result: 

(1) timing model: assume that the ratio of the Fourier 
coefficients of $f_P$ and $f_M$ in $R_\parallel(t)$ determines the probe threshold; 

(2) rate model: assume that at probe threshold the average of 
$R_\parallel(t)$ in response to probe + masker is just discriminable from 
the average in response to masker alone, and assume that the 
variance in rate is proportional to the mean. 

Pulsation threshold 

If the pulsation threshold requires a constant rate in the 
probe "channel" (Houtgast), then obviously the criteria for 
tuning curve and pulsation-threshold masking curve are similar. 
Hence, equal frequency selectivity is predicted for neural 
tuning curve and for pulsation threshold: $6\{s_1 + s_2/\nu\}$ dB/oct. 
Note that these curves are narrower than the pure-tone masking 
curve, because $\nu < 1$. 

Combination tones 

Smoorenburg's conclusions based on a similar compressing 
nonlinearity apply also to the present model. For a best fit to 
the data one has to assume propagation in some conditions. With- 
out propagation it also seems impossible to explain (1) the 
suppression of high-frequency distortion products and (2) the 
neurophysiological data of Goldstein and Kiang (1968) . Available 
evidence (Dallos, Wilson) makes it unlikely that the propagation 
occurs along the basilar membrane. Therefore it becomes impor- 
tant to investigate other potential propagation mechanisms with- 
in the organ of Corti. 

References 

The unspecified references are made to contributions to this symposium. 

Goldstein, J.L., and Kiang, N.Y.S. (1968). "Neural correlates of the aural 
combination tone 2f1-f2", Proc. IEEE 56, 981-992. 

Lowenstein, O., and Wersall, J. (1959). "A functional interpretation of the 
electron-microscope structure of the sensory hairs in the cristae of the 
elasmobranch Raja clavata in terms of directional sensitivity". Nature 
184, 1807-1808. 




Auditory Frequency Analysis 



NEURO-MECHANICAL FREQUENCY ANALYSIS IN THE COCHLEA 
J. J. ZWISLOCKI AND W. G. SOKOLICH 

Institute for Sensory Research, Syracuse University, Syracuse, N.Y. U.S.A. 



It has become increasingly evident that the mechanical filter action 
of the cochlea is sharpened before neural spikes reach a recording 
microelectrode in the auditory nerve. When the sharpening mechanism is damaged, 
the neural tuning curves more closely parallel the mechanical ones (Kiang 
et al., 1970; Evans and Wilson, 1973). This paper summarizes some of our 
experimental results suggesting one of the ways a neural sharpening may 
take place. Our conclusions are based on the premise that most if not all 
microelectrode recordings from the auditory nerve concern the radial 
fibers which end on the inner hair cells, a consequence of Spoendlin's (1966, 
1970) finding that they constitute 90 to 95% of all auditory-nerve 
afferents. If this is so, changes in the recorded frequency or time 
characteristics following selective elimination of the outer hair cells suggest 
that, in a normal ear, these cells interact with the radial fibers, either 
directly, or indirectly via the inner hair cells. Of course, such 
evidence is not conclusive since elimination of the outer hair cells may not 
be completely selective and may be accompanied by subtle changes in the 
inner hair cells. Although anatomical evidence for the interaction is 
scant (Perkins, 1973), some structural relationships are suggestive. The 
spiral fibers course quite close to the inner hair cells and join the 
radial fibers at the entrance to habenula perforata. There seem to be as 
many spiral fibers as inner hair cells and habenular openings. 

Our experiments aimed primarily at discovering whether the activity 
of the radial fibers depends exclusively on the inner hair cells or on the 
outer hair cells as well. They were begun under the tentative assumption 
that the outputs of inner hair cells were controlled by the velocity and 
the outputs of outer hair cells by the displacement of the basilar 
membrane. The assumption originated with the recordings by Dallos et al. (1972) 
of cochlear microphonics in the absence of outer hair cells. We reasoned 
that detection of both velocity and displacement components in the same 
fibers would signify a combined effect of the inner and outer hair cells. 





To separate the two components, we used sound stimuli that produced an 
approximately trapezoidal wave pattern in the cochlea. The fundamental 
period of the stimuli was 25 msec to avoid ambiguities resulting from neural 
latencies. The experiments were performed on anesthetized Mongolian 
gerbils (Sokolich and Smith, 1973) using micropipettes filled with a 3M NaCl 
solution and having resistances of 30 to 70 megohms. All recordings came 
from nerve fibers, as judged from monophasic spike forms and short 
latencies of responses to rarefaction clicks. Representative PST histograms 
obtained from three fibers with CFs in different frequency ranges are shown 
in Fig. 1. The uppermost trace indicates the cochlear microphonics 
recorded at the round window, the upward deflection coinciding with the 
displacement of the basilar membrane toward scala vestibuli. The lowest trace 
indicates the spontaneous activity of the unit with the 6.1-kHz CF. Note 
that the responses of all three units contain both velocity and 
displacement components. This was typical of all recordings obtained from gerbils 
with normal ears and led us to the early conclusion that the inner and 
outer hair cells interact (Zwislocki and Sokolich, 1973). However, a 
closer scrutiny of the histograms of Fig. 1 makes the assumption of a 
simple superposition of velocity and displacement components unlikely. In the 
1.8-kHz unit, the maximum firing rate occurs during the inferred motion of 
the basilar membrane toward scala vestibuli; in the 4.9-kHz unit, during its 
motion in the opposite direction; in the 6.1-kHz unit, both directions 
appear to be excitatory. These differences are not stochastic but rather 
systematic. The first pattern is typical of practically all units with CFs 
below about 2 kHz, the second, of units between about 2 and 5 kHz, and the 
third, of units with higher CFs. If we maintained the assumption that the 
velocity components were produced by the inner hair cells, we would have 
to conclude that the response polarity of these cells changed from one 
cochlear turn to another, a most unlikely occurrence. Instead, we found it 
possible to reconstruct all the patterns of Fig. 1 by assuming that they 
reflect the difference between two imperfectly matched components of 
opposite polarity. Under this assumption, the apparent polarity reversals 
result from small changes in latency and amplitude balance (Zwislocki, 1974). 
The assumption is also consistent with the displacement responses of Fig. 1. 
They are positive in the 1.8- and 4.9-kHz units during basilar-membrane 
displacement toward scala tympani, and are negative in the 6.1-kHz unit 
during its displacement toward both scalae, tympani and vestibuli. These 
patterns are typical for low, medium, and high-frequency units, respectively. 

Fig. 1. PST histograms of three auditory-nerve fibers with 
different CFs in response to trapezoidal wave pattern. Upper 
trace: round-window CM; S_V: scala vestibuli; S_T: scala 
tympani. Bottom trace: spontaneous activity (SP.A.). Numbers 
indicate time intervals in msec. Histograms with 0.1 
msec bins and 2000 repetitions. 

Fig. 2. PST histograms of two fibers of a kanamycin- 
treated animal. Stimulus and histograms the same as 
in Fig. 1. 
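
Why a trapezoidal pattern separates velocity from displacement can be seen in a small sketch: on the plateaus the displacement is constant and the velocity is zero, while on the ramps the velocity is constant and nonzero. Only the 25-msec period comes from the text; the symmetric 2.5-msec ramps, 10-msec plateaus, unit amplitude, and function names below are hypothetical.

```python
def trapezoid(t_ms, period=25.0, ramp=2.5, plateau=10.0):
    """Trapezoidal displacement (arbitrary units) with plateaus at +1 and -1."""
    t = t_ms % period
    if t < ramp:                              # rising ramp, -1 to +1
        return -1.0 + 2.0 * t / ramp
    if t < ramp + plateau:                    # positive plateau
        return 1.0
    if t < 2 * ramp + plateau:                # falling ramp, +1 to -1
        return 1.0 - 2.0 * (t - ramp - plateau) / ramp
    return -1.0                               # negative plateau


def velocity(t_ms, dt=1e-3):
    """Central-difference estimate of the velocity (units per msec)."""
    return (trapezoid(t_ms + dt) - trapezoid(t_ms - dt)) / (2 * dt)


for t_ms in (1.0, 7.0, 13.5, 20.0):
    print(f"t = {t_ms:5.1f} msec: displacement = {trapezoid(t_ms):+.2f}, "
          f"velocity = {velocity(t_ms):+.2f}")
```

A response that follows the ramps only would then point to a velocity component, and a response that persists through a plateau to a displacement component.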

More formally, we developed the hypothesis that the inner and outer 
hair cells interact in phase opposition. On the basis of anatomical rela- 
tionships, it appeared reasonable to expect that the inner hair cells would 
produce excitatory responses during displacement of the basilar membrane 
toward scala vestibuli, and the outer hair cells, during its displacement 
toward scala tympani. To test the hypothesis, we poisoned some gerbils 
with kanamycin which tends to destroy the outer hair cells differentially, 
especially in the basal cochlear turn. Single-fiber recordings from the 
poisoned animals clearly confirmed our hypothesis (Sokolich and Zwislocki, 
1973). Many units with medium and high CFs produced simple responses in
which excitation was associated with the displacement of the basilar mem-
brane toward scala vestibuli, and inhibition, with its displacement toward
scala tympani, a response pattern we expected from the inner hair cells.

As a dividend, we found that, in more severely damaged cochleas, low-fre- 
quency units also produced simple responses but of reversed polarity which 
we expected of outer hair cells. Surface preparations of the organs of 
Corti did not show any missing hair cells in the corresponding parts of the 
cochlea. However, Engstrom and Kohonen (1965) found that ototoxic antibi-
otics, like kanamycin, tend to produce greater damage of the inner than
of the outer hair cells in the apical turn. In general, our recordings in- 
dicated that functional changes occurred before any damage could be detected 
in surface preparations. A sample of our results on kanamycin-treated ger-
bils is shown in Fig. 2. They were produced with the help of the same stim-
ulus pattern (CM trace at the top) as the normal results of Fig. 1. The
middle trace indicates an excitatory response during motion of the basilar 
membrane toward scala vestibuli. According to our hypothesis the response 
should be associated with an inner hair cell. In the lowest trace, the re- 
versed response polarity appears to indicate the effect of outer hair cells. 
Note that both sets of responses stem from units with similar low CFs. We
chose to show this coincidence to indicate that the polarity reversal is
not due to differences in cochlear location per se. However, the response
polarity of the middle trace is only rarely found in units with CFs around 
1 kHz. On the other hand, we never found the response polarity of the 
lowest trace in units with CFs over 2 kHz. 




Another observation must be made in connection with Fig. 2. Both units
produced essentially velocity responses. This is not true for all units in
kanamycin-treated animals. Often almost pure displacement responses are
found. Velocity responses appear to occur more frequently in more severely
damaged cochleas. This could explain the finding by Dallos et al. (1972) of
CM velocity responses in the absence of outer hair cells.

The results obtained with trapezoidal stimuli could be confirmed using
sinusoidal ones. The latter allowed us to make some quantitative measure-
ments. The phase lag of the response of a unit may be described by the
mathematical expression $\theta = 2\pi f t + \varphi$, where $f$ means sound
frequency, $t$, time lag, and $\varphi$, the phase lag of neural excitation
relative to the maximum displacement of the basilar membrane toward scala
tympani at the location of the relevant hair cell. When $t$ is constant
$\theta$ depends linearly on $f$, and the intersection of the resulting
straight line with the abscissa zero determines $\varphi$. Results for three
units with low CFs and two units with high CFs are shown in Fig. 3. The
relationships between $\theta$ and $f$ are approximately linear for all units
below their respective CFs. The slopes are greater for the low-frequency
units, in agreement with their greater response latencies. Their phase
curves intersect the abscissa zero at a



Fig. 3. Phase lag of five fibers from two kanamycin-treated animals as a
function of sound frequency of a sinusoidal stimulus. Approximate
reference: max. displacement of the basilar membrane toward scala tympani
in the vicinity of the oval window. Unfilled symbols refer to units with
CFs above 3 kHz.





phase lag of -70° to -90°, in agreement with a velocity response that co-
incides with the basilar-membrane motion toward scala tympani. The phase
curves of the high frequency units intersect the abscissa zero at a phase
lag of approximately -180°, in agreement with a displacement response that
coincides with basilar-membrane displacement toward scala vestibuli. These
characteristics are completely consistent with responses to trapezoidal
stimuli we already described and confirm our basic hypothesis. It should be
mentioned again that clearly abnormal responses of low-frequency units are
associated with extensive cochlear damage in whose presence velocity rather
than displacement responses are found. This explains the -70° to -90° phase
lag in the curves of the low-frequency units.
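
The linear relation between phase lag and frequency described above lends itself to a least-squares estimate of the time lag and of the zero-frequency intercept. The following is only a minimal sketch under stated assumptions: the data points are synthetic (not taken from Fig. 3), the function name `fit_phase_lag` is hypothetical, and phase is expressed in degrees, so that the fitted slope equals 360 times the time lag.

```python
import numpy as np

def fit_phase_lag(freq_hz, phase_lag_deg):
    """Fit theta = 360 * f * t + phi (in degrees) by least squares.

    Returns (t_seconds, phi_degrees): the time lag from the slope and the
    intercept of the fitted line at zero frequency.
    """
    f = np.asarray(freq_hz, dtype=float)
    theta = np.asarray(phase_lag_deg, dtype=float)
    slope, intercept = np.polyfit(f, theta, 1)
    return slope / 360.0, intercept

# Synthetic example (assumed values, not measurements from the paper).
freq_hz = np.array([200.0, 400.0, 600.0, 800.0, 1000.0])
true_t, true_phi = 2.0e-3, -80.0   # 2 msec time lag, -80 deg intercept
phase_lag_deg = 360.0 * freq_hz * true_t + true_phi

t_est, phi_est = fit_phase_lag(freq_hz, phase_lag_deg)
print(f"time lag = {t_est * 1e3:.2f} msec, intercept = {phi_est:.1f} deg")
```

With measured data the fit would be restricted to frequencies below the CF of the unit, where the text reports the relation to be approximately linear.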

If the two kinds of responses we found in auditory-nerve fibers of
kanamycin-treated gerbils are actually produced by opposing inputs from the
inner and outer hair cells, a possible neural mechanism for sharpening the
mechanical filter action becomes evident. The situation is sketched in
Fig. 4. The solid curve shows the mechanical amplitude distribution derived
from Rhode's (1971) measurements, assuming a log frequency distribution in
the cochlea. Increasing the sound frequency somewhat would move the curve to
the position indicated by the intermittent curve. The maximum of the solid
curve coincides with the position of an inner hair cell schematized together
with a radial fiber. The outer hair cells that could interact with the
inner hair cell or the radial fiber via a spiral fiber are displaced relative
to the location of the inner hair cell toward the oval window approximately
by the length of the spiral fiber. When the sound frequency is decreased
gradually, the mechanical amplitude envelope is swept first past the outer
hair cells then past the inner hair cell. As a result, both the group of
the outer hair cells connected to the same spiral fiber and the inner hair
cell receive approximately the same frequency distribution of amplitude,
except for a small frequency shift. The situation is schematized in Fig. 5
where the solid curve indicates the amplitude distribution "seen" by the
outer hair cells, and the intermittent curve, the one "seen" by the inner
hair cell. The latter has been so normalized with respect to the former
that both coincide approximately at low frequencies. Note that, under these
conditions, the maximum of the curve associated with the inner hair cell
protrudes above the curve associated with the outer hair cells. At low
intensity levels, the neural response is approximately linearly related to the








Fig. 4. Schematic amplitude distribution of basilar-membrane displacement
for two sound frequencies. Relative positions of outer (O.H.) and inner
(I.H.) hair cells are also indicated (not to scale).




Fig. 5. Schematic amplitude of basilar membrane as a function of sound
frequency (abscissa: sound frequency in kHz) at the locations of the middle
outer hair cell and of the inner hair cell, as indicated in Fig. 4.
I.H.-amplitude normalized so that it is slightly smaller than the
O.H.-amplitude at low frequencies.






vibration amplitude. As a consequence, the two curves of Fig. 5 may be
interpreted as response contributions of the outer and inner hair cells. If
the contributions are in phase opposition, the resultant response is
determined by the difference between the two curves. This difference has
been calculated on the basis of Rhode's (1971) measurements of
basilar-membrane amplitude and is compared in Fig. 6 with a rather typical
empirical tuning curve (Kiang et al., 1967). Note that the slopes of both
curves are quite similar. The theoretical tuning curve contains a singular
point that can only arise in the presence of exact phase opposition, which is
not likely to be present in the real cochlea. As a consequence, the singular
point is without significance.
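
The construction of a tuning curve from the difference between two slightly shifted amplitude curves can be illustrated numerically. The sketch below is not the calculation made in the paper: it replaces Rhode's measured amplitudes with a toy second-order resonance (an assumption), and the names `resonance_gain` and `opposed_response`, the 10% frequency shift, the quality factor and the 0.95 normalization are all hypothetical choices made only to show the principle.

```python
import numpy as np

def resonance_gain(freq, cf, q=5.0):
    """Toy displacement gain of a second-order resonator tuned to `cf`.

    Stand-in for a mechanical amplitude curve; NOT Rhode's data.
    """
    x = np.asarray(freq, dtype=float) / cf
    return 1.0 / np.sqrt((1.0 - x**2) ** 2 + (x / q) ** 2)

def opposed_response(freq, cf_inner=1.0, cf_outer=1.1, inner_scale=0.95):
    """Inner-minus-outer contribution, assuming exact phase opposition.

    The outer hair cells are given a slightly higher best frequency
    (they lie toward the oval window); the inner curve is scaled to be
    slightly smaller than the outer curve at low frequencies.
    """
    inner = inner_scale * resonance_gain(freq, cf_inner)
    outer = resonance_gain(freq, cf_outer)
    return inner - outer, inner, outer

freqs = np.linspace(0.1, 2.0, 2000)   # arbitrary frequency units
diff, inner, outer = opposed_response(freqs)

# Threshold is taken as inversely proportional to the resultant magnitude.
threshold_db = -20.0 * np.log10(np.abs(diff) + 1e-12)

# Sign changes of the difference mark exact cancellation (singular points).
crossings = freqs[:-1][np.sign(diff[:-1]) != np.sign(diff[1:])]
print("cancellation near:", np.round(crossings, 3))
```

In this toy version the outer contribution predominates slightly at low frequencies, the inner contribution protrudes near its own maximum, and the points of exact cancellation appear only because perfect phase opposition was assumed.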

The model of Figs. 4 to 6 can be tested by measuring the response phase
of normal auditory-nerve units as a function of sound frequency. The two
curves of Fig. 5 predict the basic course the phase should take. At low 
frequencies, the contribution of the outer hair cells should predominate 
slightly, and the phase should tend to follow a straight line intersecting 
the abscissa zero at a phase lag of zero, as indicated by the dash-dot line 
in Fig. 3. At and above CF, the contribution of the inner hair cells is ex- 
pected to be minimal, and the phase values should fall on the extrapolation 
of the low frequency curve. Between these two extremes, Fig. 5 indicates a
dominant contribution of the inner hair cell. As a consequence, the phase 



Fig. 6. Comparison between an empirical tuning curve of an auditory-nerve
fiber and a tuning curve constructed from measured displacement amplitudes
of the basilar membrane (Rhode, 1971) according to the schema of Fig. 5.









Fig. 7. Phasor representation of interaction between the contributions of
the outer and inner hair cells. Deviation of the I.H.-phasor from the neg-
ative zero axis is purposely exaggerated.




Fig. 8. Phase lag of a fiber from an untreated (O) and of a fiber from a
kanamycin-treated animal (X) with slight cochlear damage. Dashed line indi-
cates the phase function expected in the absence of the contribution from
the inner hair cells.




values should deviate from the curve by up to 180°. Whether the deviation
is in the positive or negative direction would depend upon subtle departures
from exact phase opposition between the corresponding inner and outer hair
cells. The phase relationships are schematized in Fig. 7 by means of a
phasor diagram. Departure from exact phase opposition is strongly exag-
gerated for purposes of exposition. Two examples of measured phase rela-
tionships are shown in Fig. 8. The dashed curve indicates the phase values
that would be expected if the contribution of the inner hair cells were
eliminated. It nearly coincides with the dash-dot curve of Fig. 3. The
solid lines interpolate the data points obtained on one unit of an untreated
animal (U26-19) and on one unit of a kanamycin-treated animal with mild
cochlear damage (K6-14). Both units had approximately the same CF of 1.7
kHz, and the responses of unit K6-14 deviated only moderately from normal.
Clearly, both sets of data agree with the prediction.
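
The dependence of the direction of the phase deviation on small departures from exact phase opposition can be checked with a two-phasor sum. This is a minimal sketch, assuming an outer-hair-cell phasor on the zero axis and an inner-hair-cell phasor at 180° plus a small departure; the function name `resultant_phase_deg` and the amplitudes are hypothetical, and the sign convention is that of ordinary complex arithmetic rather than the phase-lag convention of the figures.

```python
import cmath
import math

def resultant_phase_deg(outer_amp, inner_amp, departure_deg):
    """Phase of the sum of an outer-hair-cell phasor at 0 deg and an
    inner-hair-cell phasor at (180 + departure_deg) deg."""
    outer = complex(outer_amp, 0.0)
    inner = cmath.rect(inner_amp, math.radians(180.0 + departure_deg))
    return math.degrees(cmath.phase(outer + inner))

for departure in (+5.0, -5.0):
    weak = resultant_phase_deg(1.0, 0.2, departure)    # outer contribution dominates
    strong = resultant_phase_deg(1.0, 2.0, departure)  # inner contribution dominates
    print(f"departure {departure:+.0f} deg: weak {weak:+.1f}, strong {strong:+.1f}")
```

When the inner contribution is small the resultant stays close to the outer phasor; when it dominates, the resultant swings toward 180° on one side or the other according to the sign of the departure.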

We conclude that firing patterns recorded from auditory-nerve fibers 
appear to result from two components in phase opposition. Structural and 
functional relationships suggest that the two components are generated re- 
spectively by the inner and outer hair cells. 



Acknowledgments

The anatomical surface preparations were made by Dr. R. P. Hamernik of 
the State University of New York Upstate Medical Center. Work supported by 
NIH grant NS03950. 



References 

Dallos, P., Billone, M. C., Durrant, J. D., Wang, C.-y., and Raynor, S.
(1972). Cochlear inner and outer hair cells: functional differences.
Science 177, 356-358.

Engstrom, H. and Kohonen, A. (1965). Cochlear damage from ototoxic anti-
biotics. Acta Oto-laryng. 59, 171-178.

Evans, E. F. and Wilson, J. P. (1973). The frequency selectivity of the
cochlea. Basic Mechanisms in Hearing. A. R. Møller, Ed. Academic
Press, New York.

Kiang, N. Y.-S., Moxon, E. C., and Levine, R. A. (1970). Auditory-nerve
activity in cats with normal and abnormal cochleas. Ciba Foundation
Symposium on Sensorineural Hearing Loss. G. E. W. Wolstenholme and
J. Knight, Eds. J. & A. Churchill, London.

Kiang, N. Y.-S., Sachs, M. B., and Peake, W. T. (1967). Shapes of tuning
curves for single auditory-nerve fibers. J. Acoust. Soc. Am. 42,
1341-1342.







Perkins, R. E. (1973). Innervation patterns in cochleas of cat and rat:
study with rapid Golgi techniques. Anatomical Record 175, 410(A).

Rhode, W. S. (1971). Observations of the vibration of the basilar membrane
in squirrel monkeys using the Mössbauer technique. J. Acoust. Soc. Am.
49, 1218-1231.

Sokolich, W. G. and Smith, R. L. (1973). Easy access to the auditory nerve
in the Mongolian gerbil. J. Acoust. Soc. Am. 54, 283(A).

Sokolich, W. G. and Zwislocki, J. J. (1973). Evidence for phase opposition
between inner and outer hair cells. J. Acoust. Soc. Am. (in press) (A).

Spoendlin, H. (1966). The organization of the cochlear receptor. Advances
in Oto-Rhino-Laryngol. 13 (L. Ruedi, Ed.) S. Karger, Basel.

Spoendlin, H. (1970). Structural basis of peripheral frequency analysis.
Frequency Analysis and Periodicity Detection in Hearing. R. Plomp
and G. F. Smoorenburg, Eds. A. W. Sijthoff, Leiden.

Zwislocki, J. J. (1974). A possible neuro-mechanical sound analysis in the
cochlea. Symposium on Auditory Analysis and Perception of Speech.
Acustica (in press).

Zwislocki, J. J. and Sokolich, W. G. (1973). Velocity and displacement
responses in auditory-nerve fibers. Science 182, 64-66.



ADDITIONAL REMARKS 

EVANS: Your attractive model does not seem to be able to meet the following
objections: (1) Phase cannot be preserved down the outer spiral fibre
because the length constant must be very small (10s of micra) compared
with the 0.6 mm length of the fibre. (2) The model does not appear to pro-
vide the sharpening necessary to account for the tip region of the tuning
curve (Evans and Wilson, 1973). (3) It cannot account for the steeper
slope of the high-frequency cut-off of the neural curves, particularly
compared with the plateau region of the basilar membrane curves.

ZWISLOCKI: Taking your objections one by one, I would like to state the fol-
lowing: (1) We have made some calculations of electrotonic propagation in
spiral fibers. The space constant is substantially greater than you assume.
However, a.c. propagation depends on the square root of frequency and the
attenuation at high sound frequencies becomes prohibitive if one assumes
electrical constants based on the squid giant axon. It is quite possible
that these constants do not apply to spiral fibers. On the other hand, the
model can function with the help of d.c. currents, and the other responses
we have obtained from fibers with high CF indicate a d.c. component. (2)
The model very clearly produces a sharpening of the tuning curve near its
tip. (3) I am not convinced that the high-frequency slope of the neural
tuning curves is steeper than that of the mechanical responses, excluding
the plateau. If sharpening takes place at the high frequency side, it
would have to arise from other mechanisms, one of which could be the inter-
action among outer hair cells.







AUDITORY FREQUENCY SELECTIVITY AND THE COCHLEAR NERVE 
E.F. EVANS 

Department of Communication, University of Keele, Staffs, U.K. 



1. Introduction.

'Frequency selectivity' denotes the ability of the auditory system to
resolve or separate out the individual frequency components of a complex
signal. Accumulating evidence (summarised in detail elsewhere: Evans &
Wilson, 1973; Evans 1974c) has led to the working hypothesis that the
frequency selectivity of the auditory system is already determined at the 
level of the cochlear nerve by the filtering properties of the cochlea. 

This paper will attempt to emphasise some of the relevant properties of 
single fibres in the mammalian cochlear nerve, and to consider some of the 
factors determining them. 

2. Physiological versus psychophysical measures of frequency selectivity.

The solid frequency threshold curves (FTCs) of Figs. 1-5 illustrate the
approximate shape of the cochlear nerve filter function for fibres of
characteristic frequency (CF) above about 2 kHz. The functions for fibres
of lower CF are more symmetrical, and progressively lose the high threshold
low frequency 'tail'. These shapes (which have been found in cat, guinea
pig, and squirrel monkey) qualitatively resemble the tone-on-tone masking
curves obtained in man by Small (1959), by Zwicker, and by Rodenburg
(personal communications).

For the purposes of quantitative comparison it is useful to consider
the effective bandwidth of the cochlear fibre filter (vd. Evans & Wilson,
1971; 1973). (This is the width of the equivalent rectangular filter, and
is derived by integrating the area under the FTC considered as an
attenuation curve in linear power and frequency coordinates; it is
approximately the half-power bandwidth.) Inasmuch as the psychophysical
critical band can be understood by analogy with linear bandpass filters
(vd. Zwicker, 1971), critical bandwidths can be compared with the effective
bandwidths of the (cat) cochlear fibres. The tolerable agreement between
the two sets of data (in spite of the species difference) supports our
working hypothesis.

The cochlear nerve effective bandwidths represent a filter 'Q' (CF/
effective bandwidth) of about 10, from 1-10 kHz. This figure agrees well
with that derived by Duifhuis (1971) from psychophysical measurements of
the temporal characteristics of the filter.
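
The effective bandwidth and the corresponding 'Q' can be computed directly from a sampled FTC by the integration described above. The sketch below is illustrative only: the function name `effective_bandwidth` is hypothetical, the FTC is synthetic (a Gaussian power-gain shape chosen so that the answer is known), and the trapezoid rule is simply one convenient way of integrating.

```python
import numpy as np

def effective_bandwidth(freq_hz, threshold_db):
    """Equivalent rectangular bandwidth of an FTC treated as an attenuation curve.

    Thresholds are converted to linear power gain relative to the most
    sensitive point, and the gain is integrated over linear frequency.
    """
    f = np.asarray(freq_hz, dtype=float)
    thr = np.asarray(threshold_db, dtype=float)
    gain = 10.0 ** (-(thr - thr.min()) / 10.0)   # peak gain = 1
    # Trapezoid rule over linear frequency.
    return float(np.sum(0.5 * (gain[1:] + gain[:-1]) * np.diff(f)))

# Synthetic FTC (assumed Gaussian power-gain shape, not measured data).
cf_hz = 10_000.0
width_hz = 1_000.0 / np.sqrt(np.pi)   # chosen to give a 1 kHz effective bandwidth
freq_hz = np.linspace(5_000.0, 15_000.0, 2001)
threshold_db = 20.0 + 10.0 * np.log10(np.e) * ((freq_hz - cf_hz) / width_hz) ** 2

erb_hz = effective_bandwidth(freq_hz, threshold_db)
q_factor = cf_hz / erb_hz
print(f"effective bandwidth = {erb_hz:.0f} Hz, Q = {q_factor:.1f}")
```

For this synthetic curve the effective bandwidth is close to 1 kHz at a CF of 10 kHz, i.e. a 'Q' near the value of about 10 quoted in the text.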

Further support comes from the agreement between measurements of the
frequency resolving power of cat cochlear fibres and of human subjects for
acoustic 'grating' stimuli, i.e. comb-filtered noise (Wilson and Evans,
1971; Evans and Wilson, 1973; Wilson and Seelmann, to be published).

The identification of the psychophysical critical band function with
the cochlear nerve filter would further account for the findings which
indicate that the critical band does not require any significant amount of
time to be established apart from the 'response time' of the cochlear
filter (Zwicker and Fastl, 1972), because the cochlear filter
characteristics hold for click stimuli also (Miller 1970; see Evans,
1974c).

While there is evidence that, up to at least 30 dB above threshold, the
cochlear nerve filter acts as if it were a linear filter to broadband
noise (e.g. de Boer, 1970; Evans & Wilson 1971, 1973), to comb-filtered
noise (Wilson & Evans, 1971; Evans & Wilson, 1973) and to click stimuli
(Miller 1970), it appears that at higher levels, the filter becomes
nonlinear or less sharply tuned, or both (vd. survey by Pfeiffer & Kim,
1973). These properties may be reflected in non-linear masking properties
and the increase in critical band associated with increase in stimulus
level. In addition, other non-linearities, apparently not level-
dependent, such as those responsible for the $2f_1 - f_2$ intermodulation
distortion product are reflected in cochlear nerve discharge patterns, and
these correlate well, but not completely, with the psychophysics (vd.
Goldstein, 1972).

3. What mechanisms produce the characteristics of the cochlear nerve 
filter? 

A controversy exists as to whether the filtering characteristics can 
be accounted for by the properties of the basilar membrane. The data of 
Rhode (1971, 1973) from the squirrel monkey suggest that the basilar 
membrane may be sharply tuned at low stimulus levels and increasingly non- 
linear and broadly tuned at higher levels. Attempts have been made to 
account for the high level non-linearities in the cochlear nerve data (see 
above) on this basis (e.g. Pfeiffer & Kim 1973). These neural data
however can also be accounted for on the basis of an alternative proposal,
namely that a broadly tuned linear basilar membrane is followed by a
second filter, with some form of nonlinearity sandwiched in between (vd.
Evans & Wilson, 1973; see discussion following Pfeiffer & Kim, 1973).
Alternative interpretations of the basilar membrane data are discussed by
Wilson in this volume; the various lines of evidence for the second filter
are outlined in Evans & Wilson (1973).

The finding by Rhode (1973), suggesting that the tuning of the basilar
membrane becomes less sharp after death of the animal, questions one of the 
lines of evidence for the second filter, namely that it is physiologically 
vulnerable. This notion was partly based on the finding that in guinea 
pigs with evidence of cochlear circulatory insufficiency, the FTCs had 
abnormally high thresholds and were as broadly tuned as measurements of 
the guinea pig basilar membrane (Evans, 1972). The physiological 
vulnerability of the second filter has therefore been tested directly in 
a recent series of experiments (Evans 1974a, b; Evans & Klinke unpublished
observations), in which the effects on the cochlea of hypoxia, cyanide and 
Frusemide (an ototoxic diuretic) have been investigated. 

The activity of single fibres in the cochlear nerve was recorded in 
pentobarbitone anaesthetized cats. (Full details of the techniques in 
Evans, 1972). The frequency threshold (tuning) curves and data 
illustrated in Figs. 1-5 are based on rapid, automatic determinations of 
the activity evoked by 60msec tone bursts of randomized frequency and 
intensity. The tones were shaped and presented at 5/sec into a closed 
sound system. Although stimulus levels are given in relative terms of 
electrical input to the condenser driver, they represent approximate 
(+8-2) dB SPL at the tympanic membrane. 

(a) Hypoxia.

Fig. 1 shows FTCs determined before and during a 4 minute period of
cochlear hypoxia produced by diluting the inspired O2 concentration to 5%
with N2O (Evans, 1974a, b). After a latent period of approximately 3
minutes, cochlear hypoxia develops, as shown by the gross cochlear AP
recorded from the round window in response to intermittent click stimuli
of constant amplitude. During this period, the FTC progressively loses
its low threshold sharply tuned 'tip', until after about 2 mins of cochlear
hypoxia the high threshold, broadly tuned (dotted) curve remains. The
cochlear microphonic (also recorded from the round window to the click
stimuli: CM) is affected only slightly. It was possible to hold some
fibres long enough to observe recovery of the low threshold sharply tuned
segment of the FTC (Fig. 2). The inset again shows the times of
determination of the FTCs in relation to changes in the gross cochlear AP.
Similar effects were observed in 13 fibres in 13 consecutive periods of
hypoxia in one cat.

Fig. 1. Effect of hypoxia on FTC. (Abscissa: tone frequency in kHz.)

Fig. 2. Reversible effects of hypoxia on FTC. (Abscissa: tone frequency
in kHz; inset traces labelled BP, CH and AP.)




(b) Cyanide.

KCN, in concentrations of about $10^{-3}$ M, was instilled directly through
the round window in cats. The changes shown in Fig. 3 were found in fibres
with CFs above 7 kHz. Loss of the low threshold sharply tuned segment
(from A to C) and near total recovery (C to F) could be obtained, again
without substantial effects on the cochlear microphonic.



Fig. 3. Reversible effects of intracochlear KCN on FTC. (Abscissa: tone
frequency in kHz.)

(c) Frusemide.

This was injected into the vertebral circulation via the subclavian 
artery, and the effects on the gross cochlear AP, CM and the FTC are shown 
in Fig. 4. Again, reversible transient loss of the tip of the FTC can be 
obtained (A to C, to F).

In each of the above cases, reversible loss of 30-40 dB of the low
threshold sharply tuned segment of the FTC occurred, leaving behind the
broadly tuned, high threshold segment with a CF consistently shifted
downwards in frequency. These effects are much greater in magnitude and
occur over a much shorter time scale than those found by Rhode (1973).
Furthermore, the fact that the cochlear microphonic is substantially
unchanged during the time in which these drastic changes occur in the FTC
(Figs. 2 and 4) is itself important evidence that the basilar membrane
properties remain relatively unchanged (on the assumption that any
distributed changes in cochlear mechanics in cases (a) and (c) would be
reflected in the round window CM).

Fig. 4. Reversible effects of Frusemide i.a. on FTC. (Abscissa: tone
frequency in kHz; injection marked "10 mg Frusemide i.a.".)

Some of the many mechanisms proposed to account for the second 
filtering process are outlined elsewhere (Evans & Wilson, 1973). An
interesting additional possibility involving fluid coupling has been
proposed by Steele (1973). Another possibility, suggested by the data of
the hypoxia experiments above, needs to be considered seriously (Evans,
1974b). This is the notion that the two segment FTC is generated by two
processes related to the inner and outer hair cells and/or their
innervations, respectively. It depends upon the demonstration by
Spoendlin (e.g. 1972) in the cat that 95% of cochlear fibres innervate
inner hair cells. Firstly, Kiang et al (1970), recording from the
cochlear nerve of cats poisoned with an ototoxic antibiotic, were unable 
to obtain the normal low threshold responses in regions of extensive 
outer hair cell loss from fibres innervating inner hair cells which were 
(to light microscopy at least) normal in appearance. Secondly, maximal 
crossed olivocochlear inhibition of cochlear fibres (which selectively 
attenuates the sharply tuned segment) occurs in fibres whose CFs 
correspond with regions of maximal density of efferent innervation of 
outer rather than inner hair cells (Wiederhold, 1970). Thirdly, 
measurements of the discharge rate versus tone intensity functions show 
that for many (but not all) cochlear fibres, the rate function for the 
more sharply tuned segment has a lower and more complicated slope than 
that for the high threshold low frequency 'tail' (Fig. 5). With the
actions of hypoxia (e.g. dotted rate functions in Fig. 5), cyanide and 
Frusemide on the cochlea, the form of the rate function at the CF 
progressively approximates to that of the high threshold segment. 

These findings are consistent with two separate excitation mechanisms 
being responsible for the two segments of the FTC. That the outer hair 
cells might be responsible for the low threshold sharply tuned segment is 
suggested by their greater susceptibility to ototoxic agents, and the 
finding that under the conditions of (a), (b) and (c) above, the FTCs 
shift towards lower frequencies. Interaction between the inner and 
outer hair cell systems could occur by action potentials propagated along
the outer spiral fibre interfering with or initiating discharges in the
initial segment of the inner radial fibres, where they come into close 
apposition (e.g. Spoendlin, 1972). There is indeed evidence of 
substantial differences between the responses of the two sets of 
receptors (vd. Dallos 1973; Karlan et al 1972). It is unfortunate for 
this speculation, however, that there is closer correspondence between the 
basilar membrane filtering properties and those of the outer hair cells 
rather than those of the inner hair cells (Dallos, 1973). This would
require the 'second filter' to be located within the outer hair cells
and/or their innervation. Clearly the need is for more data before the
roles of the hair cells, their innervations and that of other structures
such as the giant fibres linking the inner hair cells (Spoendlin 1972) can
be established.

The above considerations also encourage the view that the elevation of
threshold and the widening of the critical band found in cochlear deafness
may be due to progressive loss of the sharply tuned segment of the
cochlear fibre FTCs (vd. Evans & Wilson, 1973; Evans 1974b).

[Fig. 5. Axis labels recovered from the figure: tone frequency (kHz); tone level (dB).]





Appendix 

To a first approximation, the frequency filtering properties of 
cochlear fibres, represented in their mean discharge rate, can be 
considered as produced by a linear filter followed by a frequency 
independent non-linearity responsible for the saturation properties of 
the nerve fibres (e.g. de Boer, 1970; Evans & Wilson, 1973). That this
non-linearity is not entirely independent of frequency and other factors
is shown by the following preliminary data, which are relevant to the
articles by ZWICKER and by SCHROEDER & HALL in this volume.
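
The first approximation just described (a linear filter followed by a frequency-independent saturating non-linearity) can be written down in a few lines, which makes clear what it predicts. This is a sketch under stated assumptions: the saturating function, the parameter values and the name `discharge_rate` are hypothetical and are not taken from the paper.

```python
import numpy as np

def discharge_rate(level_db, filter_gain_db, spont=5.0, max_rate=150.0,
                   half_level_db=30.0):
    """Mean rate for a linear filter followed by a frequency-independent
    saturating non-linearity (hypothetical form and parameter values)."""
    drive_db = np.asarray(level_db, dtype=float) + filter_gain_db
    power = 10.0 ** ((drive_db - half_level_db) / 10.0)
    return spont + (max_rate - spont) * power / (1.0 + power)

levels_db = np.arange(0.0, 101.0, 10.0)
rate_at_cf = discharge_rate(levels_db, filter_gain_db=0.0)
rate_off_cf = discharge_rate(levels_db, filter_gain_db=-20.0)  # 20 dB attenuation

# In this idealised model the off-CF function is the CF function shifted
# by 20 dB along the level axis, with the same slope and saturation rate.
print(np.allclose(rate_off_cf[2:], rate_at_cf[:-2]))
```

Under this idealisation all rate functions of a fibre would be copies of one another displaced along the intensity axis; departures in slope or saturation rate with frequency, such as those reported below, are what the simple model cannot reproduce.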

The data were obtained under the conditions outlined on p.3. It
should be emphasised that both the frequency and intensity of the tone
bursts were randomised in order of presentation.

Fig. 6. Discharge rate-intensity functions of cat cochlear fibre of CF:
1.05 kHz. Upper plot: frequency response plot: length of line indicates
mean number of spikes evoked by a 60 msec tone burst of frequency and
intensity indicated by the location of the centre of the line.
(Calibration in upper right-hand corner: 10 spikes.) Mean of 5 stimulus
presentations. Lower plots: discharge rate versus intensity at frequencies
indicated by symbols and letters on upper plot: A-F: 0.36, 0.53, 0.78,
1.05, 1.4, and 1.55 kHz respectively.

A. Dependence of discharge rate-intensity functions on frequency.

Figs. 6 and 7 show plots of discharge rate versus tone intensity for
two cochlear fibres representative of fibres of low (< 2 kHz) and high CF
respectively. Rate functions for frequencies above the CF are generally
less steep and saturate at lower discharge rates than those at the CF
(Fig. 6: E, F, cf. D; Fig. 7: E, cf. D). For frequencies less than the CF,
the rate functions in some high CF units and in most low CF units are less
steep than at the CF (Fig. 6: A, B, C), with the exception of frequencies
corresponding to the low frequency high threshold 'tail' of the FTCs of
high CF fibres, where the slopes in almost all cases are substantially
greater than at the CF (Fig. 5: LF; Fig. 7: A, B). For many low CF fibres,
the discharge rate for frequencies below the CF saturates at a lower rate
than at the CF (Fig. 6: A, B, C). While this finding is in the direction
predicted by the model of SCHROEDER & HALL, it is not clear whether the
prediction would hold for frequencies as high as 0.78 kHz (Fig. 6C).

Fig. 7. Discharge rate-intensity functions of cat cochlear fibre of CF:
14.5 kHz. As Fig. 6. Mean of 4 stimulus presentations. A-E: 5.7, 8.1,
12.0, 14.5, 17.7 kHz respectively.

B. Dependence of FTCs and rate-intensity functions on level and duration
of background noise.

Fig. 8 illustrates the effects of successively 10 dB higher levels of
continuous white noise on the frequency response of a cat cochlear fibre.
(Noise power in 40 kHz bandwidth equal to tone power.) B-E respectively
represent levels of 5, 15, 25, and 35 dB above approximate noise
threshold. A is the response in absence of noise.

Fig. 8. [...] cochlear fibre. Plots as in Fig. 6. A: no noise. B-E: noise
at +50, 60, 70, and 80 dB respectively (see text).

Fig. 9. Discharge rate-intensity functions of fibre of Fig. 8 at CF.
A: no noise. B-E: noise at +50, 60, 70 and 80 dB respectively.

[Figure axis label: discharge rate (spikes/60 msec tone burst).]
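
Since the noise level is specified as the total power in a 40 kHz band, the level per unit bandwidth follows from a one-line conversion. The sketch below assumes ideally flat noise over that band; the function name `spectrum_level_db` is hypothetical, and the levels remain in the relative dB units used for the tones.

```python
import math

def spectrum_level_db(total_level_db, bandwidth_hz):
    """Level per 1 Hz band of white noise whose total power over
    `bandwidth_hz` corresponds to `total_level_db`."""
    return total_level_db - 10.0 * math.log10(bandwidth_hz)

for noise_db in (50, 60, 70, 80):   # settings B-E
    print(noise_db, round(spectrum_level_db(noise_db, 40_000.0), 1))
```

The correction is about 46 dB for a 40 kHz band. The quoted settings also imply a noise threshold near +45 dB on the same relative scale, since +50 dB is described as 5 dB above threshold.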

In all cases where this has been carried out, the tip of the FTC was
elevated approximately 10 dB with each change in level above threshold.
However, it appears that the elevation in threshold is less at
frequencies corresponding to the low frequency tail than at the CF. The
10 dB bandwidth does not increase until tip threshold elevations of more
than 30 dB occur.

A curious, unexpected and consistent finding in these experiments was
a reduction in the discharge rate evoked by the tones with higher levels
of continuous noise. This is evident in Fig. 8, but is more clearly
shown in Fig. 9 by the rate functions, obtained from the same fibre at
the CF, corresponding to A-E in Fig. 8. Fig. 9 also demonstrates a
general finding that the response saturates at progressively lower rates
with increase in level of the continuous noise. These findings are not
obtained if the noise is gated and presented only simultaneously with
the tones, i.e. in 60 msec bursts. These effects must therefore
represent a decrease in "gain" and/or reduction in saturation rate with
time of exposure to continuous noise. This is also reflected in the
restricted range of discharge rates in the absence of tone ("SPONT")
under continuous compared with intermittent noise presentation. While
the boundaries of the FTCs appear to be very similar under continuous
and intermittent noise conditions, it is clear that continuous noise
exerts substantial time-dependent effects on the non-linear transduction
mechanism.

References 

Dallos, P. (1973). In Basic Mechanisms in Hearing. pp335-372. Academic
Press, N.Y.

De Boer, E. (1970). In Frequency Analysis and Periodicity Detection in
Hearing. pp204-212. Leiden: Sijthoff.

Duifhuis, H. (1971). J. Acoust. Soc. Am. 49: 1155-1162.

Evans, E.F. (1972). J. Physiol. 226: 263-287.

Evans, E.F. (1974a). J. Physiol. 221 In Press.

Evans, E.F. (1974b). Audiol. In Press.

Evans, E.F. (1974c). Cochlear nerve and cochlear nucleus. In Handbook of
Sensory Physiology. Vol. V Part II. In Press.

Evans, E.F. & Wilson, J.P. (1971). Proc. 7th Internat. Cong. on Acoustics
3: 453-456. Budapest: Akadémiai Kiadó.







Evans, E.F. & Wilson, J.P. (1973). In Basic Mechanisms in Hearing. pp519-
551. Academic Press, N.Y.

Goldstein, J.L. (1972). In Hearing Theory, 1972. pp186-208. Eindhoven:
IPO.

Karlan, M.S., Tonndorf, J. & Khanna, S.M. (1972). Ann. Otol. 81: 696-704.

Kiang, N.Y-S., Moxon, E.C. & Levine, R.A. (1970). In Sensorineural
Hearing Loss. pp241-268. London: Churchill.

Miller, A.R. (1970). Acta Physiol. Scand. 78: 299-314.

Pfeiffer, R.R. & Kim, D.O. (1973). In Basic Mechanisms in Hearing. pp555-
587. Academic Press, N.Y.

Rhode, W. (1971). J. Acoust. Soc. Am. 49: 1218-1231.

Rhode, W. (1973). In Basic Mechanisms in Hearing. pp49-63. N.Y.: Academic
Press.

Rose, J.E., Hind, J.E., Anderson, D.J. & Brugge, J.F. (1971).
J. Neurophysiol. 34: 685-699.

Small, A.M. (1959). J. Acoust. Soc. Am. 31: 1619-1625.

Spoendlin, H. (1972). Acta Otolaryngol. 73: 235-248.

Steele, C.R. (1973). In Basic Mechanisms in Hearing. pp69-90. N.Y.:
Academic Press.

Wiederhold, M.L. (1970). J. Acoust. Soc. Am. 48: 966-977.

Wilson, J.P. & Evans, E.F. (1971). Proc. 7th Internat. Cong. on Acoustics
3: 397-400. Budapest: Akadémiai Kiadó.

Zwicker, E. (1971). Proc. 7th Internat. Cong. on Acoustics, 1^: 189-192.
Budapest: Akadémiai Kiadó.

Zwicker, E. & Fastl, H. (1972). J. Acoust. Soc. Am. 52: 699-702.






COMMENTS ON: "Auditory Frequency Selectivity and the Cochlear Nerve" 

(E.F. EVANS) 

J. SCHWARTZKOPFF 

Lehrstuhl für Allgemeine Zoologie, Ruhr-Universität, Göttingen, FRG

My co-workers R. Necker (Z. vergl. Physiol. 367, 1970) and G. Kauffmann
(J. Comp. Physiol., 1974, in press) have studied the cochlear
microphonics (CM), and less thoroughly EP, SP and AP, in birds and
reptiles by practically the same methods as Dr. Evans did. The findings and
their interpretation have been presented partially at the Stockholm
Symposium of 1972 and partially at the Cochlea-Symposium at Halle last week.
The facts are:

As in mammals, in our experimental animals a rarefaction click induces,
as its first response, a positive deflection of CM, which we call CM+.
This component is eventually responsible for the phase-locked excitation
of the auditory nerve. A condensation click produces an initial CM- which
inhibits nervous excitation. If a sinusoidal signal is presented to a
bird's ear, a sinusoidal CM is produced. After 60 sec of N2 respiration,
the negative half-wave (CM-) is completely suppressed. This means
complete rectification of the CM. In principle, this has been reported by
Riesco-McClure, J.S., Davis, H., Gernandt, B.E. and Covell, W.P. in 1949
(Proc. Soc. exp. Biol. N.Y. 71, 158) from the guinea pig. Necker has
shown in birds, too, that under hypoxia partial rectification occurs, as
Dr. Evans has demonstrated. Click stimuli combined with anoxia have the
same consequences. We applied pairs of clicks with opposite initial
displacement separated by 10 msec. If the condensation click comes first,
CM- appears first, and CM+ can be seen at the beginning of the following
round window response. After 50 sec of N2 respiration CM- is completely
suppressed while CM+ does not change essentially. The evaluation of the
time course of the effects of short-time anoxia demonstrates complete
reversibility. AP and EP, which were also recorded in many experiments,
behave similarly, while SP may increase, though some seconds later.

Cyanide has roughly the same effects as anoxia. If the perfusion
concentration is chosen carefully, the depression of CM- is reversible, too.
Cooling of the cochlea leaves CM+ untouched, while CM- decreases with a
Q10 of about 2.0 (38 to 28°C). The results reported seem to indicate that
the "second filter" may be coupled with the metabolism of the hair cells.
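
The Q10 figure quoted above can be read with the usual definition of the temperature coefficient, Q10 = (R2/R1)^(10/(T2-T1)). The following sketch is illustrative only; the function names are hypothetical, and it simply shows that a Q10 of 2.0 over a 10 degree drop corresponds to a halving of the measured quantity.

```python
def q10_scale(q10: float, t_from_c: float, t_to_c: float) -> float:
    """Factor by which a quantity changes when the temperature goes
    from t_from_c to t_to_c (degrees Celsius), for a given Q10."""
    return q10 ** ((t_to_c - t_from_c) / 10.0)


def q10_from_measurements(r1: float, t1_c: float, r2: float, t2_c: float) -> float:
    """Q10 estimated from two measurements r1 at t1_c and r2 at t2_c."""
    return (r2 / r1) ** (10.0 / (t2_c - t1_c))


if __name__ == "__main__":
    # Cooling from 38 to 28 degrees C with Q10 = 2.0, as quoted in the text.
    print(q10_scale(2.0, 38.0, 28.0))
```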







ADDITIONAL REMARKS 

SPOENDLIN: There seems to be a good correlation between the effects of
anoxia on the frequency threshold curve of the neurons you recorded and
the structural changes we observe in the radial dendrites to the inner hair
cells under short anoxia. After 2-3 minutes of anoxia these dendrites swell
enormously and selectively whereas the fibres to the outer hair cells remain
unchanged. These swellings are probably reversible when anoxia does not
last too long. The possibility has to be kept in mind that the loss of the
highly tuned portion of the frequency threshold curve under hypoxia could
be explained with the inner hair cell system alone.

WILSON: A model of cochlear filtering based on interaction between inner
and outer hair cell outputs is very attractive. It would, however, appear
that the specific effects that you observe on the low threshold sharp
segment of the FTC might be due to damage to the inner hair cell component of
the interactions rather than the outer as you suggest. Bekesy (1960)
observed with methylene blue dye that inner hair cells were the first to die
under anoxia. Spoendlin (this volume) observed that inner radial fibres
swell rapidly with onset of hypoxia whereas outer radial fibres do not. The
shapes of the FTCs under acute hypoxia or ototoxic poisoning resemble
basilar membrane displacement, which would not be so if inner hair cells
responded to velocity, longitudinal shear or some other complex function
of the stimulus. Finally, there is reason to believe that long and short
term effects of ototoxic agents are quite different because the former
result in loss of cochlear microphonic whereas the latter need not.

EVANS: I do not find your objections conclusive. On the contrary, attempts
to correlate the low threshold sharply tuned segment with IHC rather than
OHC function have to face two problems: (1) according to CM data (Dallos
et al.) the OHC are the more sensitive; (2) the shift in CF with loss of the
sharply tuned segment of cochlear fibres is towards lower frequencies.



EVANS: Evidence that the apparent loss of sharp tuning is not due to 
nonspecific effects of the agents on hair cell sensitivity or spike 
generation mechanisms: (a) threshold elevation occurs only for the low 
threshold sharply tuned segment — the low frequency tail is unchanged; 
(b) under favourable conditions non-specific depression of responsiveness 
and spontaneous activity can be demonstrated but these invariably occur 
after the specific effects (i.e. loss of the low-threshold sharply tuned 
segment); (c) instilling tetrodotoxin into scala tympani produces, as 
expected, "non-specific" reduction and elimination of spike generation 
without changes in the FTC (Evans and Klinke, unpublished data). 







ON A PSYCHOACOUSTICAL EQUIVALENT OF TUNING CURVES 
EBERHARD ZWICKER 

Institut für Elektroakustik, Technische Universität
München, FRG



INTRODUCTION: When measuring tuning curves of single units in
the auditory nerve (e.g. KIANG et al. 1965, KATSUKI 1966,
EVANS 1972) the sound pressure level of a tone burst is
increased until the spike rate of a single fibre reaches a
particular value just above the spontaneous activity. This
value is chosen more or less arbitrarily, but has to be a
constant response criterion, for example 1 spike per tone burst
(KIANG et al. 1970). The sound pressure level of the tone
burst necessary to reach this criterion as a function of its
frequency is called the tuning curve. The SPL of the tone, its
frequency f, the response criterion and the single fibre are
the 4 values involved in the production of such tuning curves.
To produce corresponding curves in psychoacoustics, equivalents
to these 4 values have to be found. Level of the tone (burst)
and its frequency f can be identical values whereas the
equivalent of the response criterion can be realized in some kind
of a threshold. However, a single fibre has no direct
psychoacoustical equivalent, but a very faint sinusoidal tone with a
frequency corresponding to the characteristic frequency (CF)
of the fibre may be a good approximation, although the spatial
extension of a single fibre is always "narrower" than the
excitation produced by a faint tone.

Consequently, the psychoacoustical equivalent may be measured 
as threshold of a very faint pure tone (called CF-tone) at the 
characteristic frequency, masked by another tone (called f- 
tone) with the frequency f. The sound pressure level of the 
f-tone necessary to just mask the CF-tone as function of the 
frequency f represents the psychoacoustical tuning curve. 







Zwicker: TUNING CURVES 

PROCEDURE: (1) Set the CF-tone to a small sound pressure level
(corresponding to, for example, SL=5dB) and keep it constant
during the measurement. (2a) Vary the frequency f of the additional
f-tone and its sound pressure level SPL so that the CF-tone
is just masked. (2b) When using a BÉKÉSY-type audiometer (method
of tracking), the SPL of the f-tone (the masker!) can be
varied by the observer, while the frequency of the f-tone (the
masker!) changes slowly. This way, the psychoacoustical tuning
curve can be recorded directly (see Fig. 2).

All measurements were performed monaurally using earphones and
an equalizing network as described in ZWICKER and FELDTKELLER
(1967). The CF-tone is alternately switched on for 600 ms and
off for 600 ms.
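
The tracking step (2b) can be pictured with a small simulation. This is a minimal sketch under stated assumptions, not the apparatus used here: the V-shaped "true" curve, its slopes (30 and 100 dB/octave), the 1 dB step and the 2 dB hysteresis of the simulated observer are all hypothetical values chosen for illustration, as are the function names.

```python
import math


def required_masker_level_db(f_hz: float, cf_hz: float = 2000.0) -> float:
    """Hypothetical masker SPL that just masks the CF-tone (illustration
    only): a V shape in log frequency with assumed slopes."""
    octaves = math.log2(f_hz / cf_hz)
    slope = 30.0 if octaves < 0 else 100.0  # dB per octave, assumed
    return 20.0 + slope * abs(octaves)


def simulate_tracking(f_start=500.0, f_stop=4000.0, n_steps=600,
                      step_db=1.0, start_db=40.0, hysteresis_db=2.0):
    """Masker frequency glides slowly from f_start to f_stop while the
    simulated observer reverses the level change whenever the CF-tone
    changes between audible and masked. Returns (frequency, level) pairs."""
    level = start_db
    direction = +1  # +1: masker level rising, -1: falling
    trace = []
    for i in range(n_steps):
        f = f_start * (f_stop / f_start) ** (i / (n_steps - 1))
        target = required_masker_level_db(f)
        if direction > 0 and level >= target + hysteresis_db:
            direction = -1  # CF-tone clearly masked: let the level fall
        elif direction < 0 and level <= target - hysteresis_db:
            direction = +1  # CF-tone clearly audible: raise the level
        level += direction * step_db
        trace.append((f, level))
    return trace


if __name__ == "__main__":
    for f, level in simulate_tracking()[::100]:
        print(round(f), round(level, 1))
```

After an initial settling phase the recorded level zigzags in a narrow band around the assumed curve; the centre of that band is what would be read off as the tuning curve.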

RESULTS: a) Effect of sensation level of CF-tone. Fig. 1 shows
the SPL of the f-tone necessary to just mask a CF-tone at
different sensation levels (SL=2dB, 5dB, 25dB and 45dB; CF =
2 kHz) as function of its frequency f. Measurements at SL=2dB
are very difficult, since even without masking by the f-tone the
CF-tone is heard part of the time only. For SL=5dB the result is
not much different but much more reproducible. Around the
characteristic frequency beats make reliable measurements
difficult. If the SL is increased further (SL=25dB, 45dB), not only
does the contour change its shape, but combination tones produced
by the hearing system's nonlinearity also become audible and
influence the results, as was described also by SMALL (1959) who
measured at SL=15dB and 30 dB. Therefore, what is called
psychoacoustical tuning curve in this paper is measured with a
CF-tone at 5 dB sensation level.

Fig. 1: SPL of a pure tone (ordinate) which just masks a 2kHz-tone
of different SPLs and SLs respectively (parameter), as function of
its frequency f. [Figure not reproduced; legible curve labels
(SPL / SL): 31dB / 25dB and 8dB / 2dB.]



b) Normal tuning curves. Fig. 2 represents original recordings
of tuning curves for 3 different characteristic frequencies but the
same subject (H.F.r.). The frequency f is plotted on a nonlinear
scale which produces a linear division of the critical band rate
scale. The contours show a "V" shape for low characteristic
frequencies and a "tail" towards low frequencies which increases
drastically for high characteristic frequencies. The similarity
between psychoacoustical tuning curves (Fig. 2) and computer
processed tuning curves of single auditory nerve fibres of the cat
(Fig. 3) is striking: With a frequency shift of a factor 1.8
(cat-human) they become almost identical! Although the
psychoacoustical tuning curves differ typically from subject to
subject, the differences are not so large as to preclude averaging.
Fig. 4 shows the average contours (mean of 4 subjects, E.Z.r. not
included) for the same 3 characteristic frequencies as in Fig. 2,
and 2 individual curves. Although there are individual differences,
the averaged psychoacoustical tuning curves show the same striking
similarity with the physiological ones as the curves of a single
observer.

Fig. 2: Original recordings of psychoacoustical tuning curves
(SL=5dB) for characteristic frequencies of 630 Hz, 2kHz and 8kHz;
abscissa: frequency f (lower scale) and critical band rate (upper
scale), respectively. [Figure not reproduced.]

Fig. 3: Computer processed tuning curves from single auditory nerve
fibres (cat) of different characteristic frequencies (KIANG and
MOXON, 1973). (These original contours are received with thanks
from Dr. N. KIANG.) [Figure not reproduced.]

Fig. 4: Averaged psychoacoustical tuning curves (solid: mean of 4
subjects) for the same characteristic frequencies as in Fig. 2.
For CF=2kHz, the "narrowest" (dotted) and the "broadest" (dashed)
contour out of the sample are indicated. [Figure not reproduced.]

c) Effects of additional masking sounds. Tuning curves with
additional masking have been measured by KIANG and MOXON (1973).
Using broad band noise or narrow band noise centered at the
characteristic frequency, their results show a parallel shift
of the whole tuning curve towards higher levels. This effect
has been "reproduced" in psychoacoustical tuning curves for
broad band noise (Fig. 5a) as well as for narrow band noise
(Fig. 5b) centered at 2 kHz, the frequency of the CF-tone. As a
consequence of the masking, the psychoacoustical tuning curves
are shifted upward almost parallel toward higher levels. The
level of the CF-tone has to be raised by 20 dB and 40 dB,
respectively, because of the masking effect, but the sensation
level - here level above masked threshold - is kept constant at
SL=5dB. These "masked psychoacoustical tuning curves" can be
approximated quite well by the curves measured without masking
and with the corresponding CF-SPLs of 31 dB and 51 dB, respectively
(Fig. 1). Some small differences still remain, but the unmasked
psychoacoustical tuning curves produced at the higher sensation
level are good approximations in both cases.

Fig. 5: Psychoacoustical tuning curves obtained during additional
masking with a) white noise (WN), the density level l_WN of
which is given, b) narrow band noise (NBN) centered at 2 kHz with
the same density levels l_NBN as in a), c) narrow band noise
centered at 500 Hz and 4 kHz, respectively. CF is 2 kHz, and
additionally 9.1 kHz for c). [Figure not reproduced.]
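
The level bookkeeping in this comparison can be written out as a short worked example. It is a sketch under one stated assumption: a threshold in quiet of 6 dB SPL for the 2-kHz CF-tone, a value not given explicitly but inferred here from the pairing of SL=25dB and 45dB (Fig. 1) with CF-SPLs of 31 dB and 51 dB. The function and constant names are hypothetical.

```python
# Assumed threshold in quiet of the 2-kHz CF-tone, inferred from the text
# (31 dB SPL corresponds to SL = 25 dB); not stated explicitly by the author.
QUIET_THRESHOLD_DB_SPL = 6.0


def cf_tone_spl(sl_db: float, threshold_shift_db: float = 0.0,
                quiet_threshold_db_spl: float = QUIET_THRESHOLD_DB_SPL) -> float:
    """SPL of the CF-tone presented `sl_db` above its (possibly masked)
    threshold; `threshold_shift_db` is the elevation caused by masking."""
    return quiet_threshold_db_spl + threshold_shift_db + sl_db


if __name__ == "__main__":
    # Masked conditions: threshold raised by 20 dB and 40 dB, SL kept at 5 dB.
    print(cf_tone_spl(5.0, 20.0), cf_tone_spl(5.0, 40.0))
    # Unmasked conditions of Fig. 1 with the same CF-tone SPLs.
    print(cf_tone_spl(25.0), cf_tone_spl(45.0))
```

Under this assumption both masked conditions and the unmasked SL=25dB and SL=45dB conditions present the CF-tone at the same sound pressure levels, which is why those curves are the natural ones to compare.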
Additional masking produced by sounds in frequency regions apart
from the frequency of the CF-tone is more effective for sounds
below than for sounds above the CF-tone (see ZWICKER and
FELDTKELLER 1967). KIANG and MOXON (1973) gave an example for a
fibre with very high characteristic frequency of about 19 kHz and
an additional narrow band noise centered at 1 kHz. The tuning curve
remained almost unchanged; only near the CF it was shifted towards
higher levels. The corresponding psychoacoustical tuning curves for
CF = 2 kHz with an additional narrow band masker centered at 500 Hz
are shown in Fig. 5c. For medium level of the masker the tuning
curve is changed mostly at frequencies around the frequency of the
masker. For high levels, however, the masker influences
additionally the area around the CF. In this case (Fig. 5c) the
nonlinearity of the masking effects (an increment of 8 dB of the
masker produces 20 dB increment of the threshold) seems to be more
involved than in the configurations of Fig. 5a and 5b. This
nonlinear behaviour of the upper slope of the masking pattern
seems to produce the dominant effects in this case of additional
masking.

The influence of the characteristic frequency was tested with a
subject (H.F.r.) who showed not only narrow psychoacoustical
tuning curves, but also narrow ordinary masking patterns with
steep slopes. The contours (Fig. 5c) measured by this subject
using a CF-tone at 9.1 kHz without and with additional masking
narrow band noise at 4 kHz indicate that additional masking
produces similar results for another subject and another CF. To
avoid the influence of the stapedius muscle reflex, levels of
the masker much above 80 dB have not been used.

DISCUSSION: The psychoacoustical tuning curves become narrower
and show steeper slopes with increasing characteristic frequency.
This effect becomes very pronounced if the tuning curve
without additional masking (SPL = -∞) of subject H.F.r. (Fig. 5c)
is compared with the set of tuning curves shown in Fig. 2.
Ordinary masking patterns, masked by narrow band noises, show this
tendency too. When using tones as maskers instead of narrow
band noises, the conventional masking patterns become even
narrower since their slope at low frequencies is shifted by
about one half critical band toward higher frequencies (MAIWALD
1967).

Although a faint sinusoidal tone (SL=5dB) may stimulate more
than one single auditory-nerve fibre, the psychoacoustical tuning
curves of humans show great similarity to the physiologically
obtained tuning curves of single auditory-nerve fibres in
cats. The asymmetrical shape of the tuning curves of animals as
well as the shape of their psychoacoustical equivalent in humans
correspond to the fact that ordinary masked thresholds have a
long "tail" (with a distance to threshold in quiet of only a few
dB) toward high frequencies before they reach threshold in
quiet. An example is given in Fig. 6, where masked and unmasked
thresholds of the subject E.Z.r. (average curves of the "tracking
bands" produced by the BÉKÉSY method) are plotted. The dotted
line (produced by a 105 Hz - 65dB masker) shows a distinct
"tail" between 1.8 kHz and 3.3 kHz. Towards low frequencies,
i.e. for high frequency maskers, however, the masked thresholds
do not show such tails, but fit into the threshold in quiet almost
immediately even for masker levels up to 80 dB. The counterparts
to the tuning curves and especially the "tails" shown in
Fig. 6 indicate on the other hand that tuning curves as they are
measured with criteria very near threshold (physiologically or
psychoacoustically) are too close to the spontaneous activity and
too close to threshold in quiet, respectively, to be representative
for the frequency selectivity of the hearing system in general.
Ordinary masked thresholds and their - to be measured -
physiological equivalent or iso-rate contours seem to be more relevant.
Nevertheless, a gap between the tuning curves in animals and the
ordinary masked thresholds in man seems to be filled up by
means of the described psychoacoustical tuning curves. Slopes
of more than 100 dB/oct can be measured in psychoacoustical
tuning curves as well as in ordinary masked thresholds, especially
at high frequencies and with pure tone maskers. This means that
the frequency selectivity of the hearing system as measured by
psychoacoustical masking patterns, i.e. in a very late stage,
is already existent in auditory nerve fibres, i.e. in a very
early stage. Therefore it is most likely that the hydromechanical
system within the cochlea together with the additional
hydromechanical system within the scala media (see ZWICKER
1974 a, b) produces most if not all of the frequency selectivity
in hearing.

Fig. 6: Conventional masked thresholds of a pure tone as function
of its frequency f. Masker: another pure tone of given frequency
and sound pressure level. Threshold in quiet is indicated, too.
(Subject E.Z.r.) [Figure not reproduced.]



ACKNOWLEDGEMENT: This research was carried out within the
Sonderforschungsbereich Kybernetik München, supported by the
Deutsche Forschungsgemeinschaft.



LITERATURE:

Evans, E.F. (1972). "Does Frequency Sharpening occur in the
Cochlea", in: "Symposium on Hearing Theory"
(IPO, Eindhoven, Holland) 27-34.

Katsuki, Y. (1966). "Neural Mechanism of Hearing in Cats and
Monkeys", in: "Progress in Brain Research, Part A:
Fundamental Mechanism", T. Tokizane and J.P. Schadé, Eds.
(Elsevier, Amsterdam) 71-97.

Kiang, N.Y.S., Moxon, E.C., and Levine, R.A. (1970). "Auditory-
Nerve Activity in Cats with Normal and Abnormal Cochleas",
in: "Sensorineural Hearing Loss", G.E.W. Wolstenholme
and J. Knight, Eds. (J. and A. Churchill, Great Britain)
241-273.

Kiang, N.Y.S. and Moxon, E.C. (1973). "Tails in Tuning Curves
of Auditory Nerve Fibres", Paper given at v. Bekesy-Symposium
1973, to be published in J. Acoust. Soc. Am.

Maiwald, D. (1967). "Beziehungen zwischen Schallspektrum,
Mithörschwelle und Erregung des Gehörs", Acustica 18, 69-80.

Small, A.M. Jr. (1959). "Pure-Tone Masking", J. Acoust. Soc. Am. 31,
1619-1625.

Zwicker, E. und Feldtkeller, R. (1967). Das Ohr als
Nachrichtenempfänger, 2nd Ed. (S. Hirzel-Verlag, Stuttgart).

Zwicker, E. (1974a). "Ein hydrodynamisches Ausschnittmodell des
Innenohres zur Erforschung des adäquaten Reizes der
Sinneszellen", Acustica, in press.

Zwicker, E. (1974b). "Spaltweite und Spaltströmung in einem
Ausschnittmodell des Innenohres", Acustica, in press.







Zwicker: TUNING CURVES (APPENDIX) 



In Fig. 4 of the paper by KIANG and MOXON (1974) the "threshold"
with respect to a 1-kHz tone is given for units of different
characteristic frequency (CF). The open symbols in Fig. 7
represent these data of "ordinates of physiological tuning curves
read at 1-kHz". The corresponding psychoacoustical data can
be produced by masking a just audible CF-tone (SL=5dB) at various
frequencies by a 1-kHz masker tone. The sound pressure level
of this tone necessary to just mask the CF-tone, as function
of its frequency, should produce equivalent results. Data of 2
subjects are given as filled and partly filled symbols
respectively. Two different sets of data have been produced: one
with the 1-kHz tone as masker, the other with a 500-Hz tone
as masker in order to take the frequency range relation for human
versus cat into consideration. The 500-Hz masker data should be
read using the upper scale of the abscissa.

Although the results of humans seem to show a steeper slope at
low frequencies, the agreement between physiological and
psychoacoustical data is fairly good, pointing again to the fact
that the frequency selectivity measured by psychoacoustical
masking methods is basically effective already in single fibres
of the auditory nerve.

Fig. 7: SPL on physiological tuning curves (KIANG and MOXON, 1974)
at the frequency of 1kHz (open symbols) for different CFs
(abscissa). SPL of a 1-kHz masker tone needed to just mask a faint
tone (SL=5dB) of different frequencies CF (abscissa).
Corresponding results for a 500-Hz masker tone belong to the upper
abscissa. [Figure not reproduced; legend: subjects H.F.r. and
E.Z.r., maskers 1kHz and 500Hz.]







ADDITIONAL REMARKS 



LEGOUIX: It seems worthwhile to report that experiments on two-tone
suppression observed on cochlear microphonics give curves which have
many similarities with the masking curves presented by Dr. Zwicker.
This might be explained by the fact that the decrease of CM is
proportional to the amplitude of the CM response to the suppressing tone and
so gives at least an approximate picture of the resonance of the
basilar membrane movement. These observations may be of some value in
deciding whether the tuning is already present in the microphonic response.
They can also help to explain the mechanism of masking.







PURE-TONE MASKING: A NEW RESULT FROM A NEW METHOD
L.L.M. VOGTEN 

Institute for Perception Research, Eindhoven, The Netherlands 



1. Introduction

Exactly half a century ago Wegel and Lane (1924) published the
first systematic quantitative results of masking with pure
tones. In subsequent pure-tone masking experiments the
frequency region where probe and masker nearly coincide has been
rather neglected. One reason for this undoubtedly
is that in this region beats complicate the picture
(e.g. Wegel/Lane, 1924, Egan/Hake, 1950, Zwicker, 1967). As a
consequence, many authors avoid small frequency differences
between probe and masker or use noise as a masker. But even
the application of narrow-band noise does not completely
avoid intensity fluctuations in the stimulus, i.e. masker +
probe (Bos/de Boer, 1966).

The present paper deals with experiments on pure-tone masking
of short tone bursts. Using a phase-locking technique, short
probe durations can be applied without loss of the deterministic
character of the stimulus. In particular the frequency region
where probe and masker frequency are close together
can now be studied in detail.

After a description of the method used, we report a few new
phenomena and compare the outcome with physiological results.

A full explanation of the new phenomena is not yet available;
efforts continue to find a physiological model
that explains the psychophysical facts.

2. Method

The stimulus used is the sum of a quasi periodically repeated
tone burst (probe) and a stationary sinusoidal masker (fig. 1).
New in our method is that the probe starts at a fixed phase
of the masker, independent of the masker frequency f_m. So the
stimulus is always completely defined. Physical parameters






Vogten: PURE-TONE MASKING 



like envelope, instantaneous frequency, energy difference
etc. which may be relevant for probe detection, can now be
calculated exactly as a function of time, masker phase φ,
and the frequency difference between masker and probe
(Vogten, 1972).

Fig. 1. The stimulus is the sum of masker (stationary sine wave)
and probe (tone burst). The probe consists of an integral
number of periods 1/f_p; its repetition time T_0 (about
500 ms) equals exactly an integral number of masker
periods 1/f_m.



In the present experiments we will keep the probe frequency
f_p fixed and vary f_m. This is in contradistinction to the
"classical" masking experiments (e.g. Wegel/Lane, 1924,
Egan/Hake, 1950, Ehmer, 1959, Zwicker, 1967), in which the
role of independent variable is played by the probe frequency
f_p. One of the reasons for keeping f_p fixed is that masking
of a fixed-frequency probe can be compared more properly
with physiological data from single auditory nerve fibres.
This comparison will be discussed in section 6.

Unless stated otherwise, the probe frequency f_p is 1 kHz,
probe duration T is 50 ms, the rise/decay time of the probe
envelope is 3 ms (smoothed edges), the repetition frequency
about 2 Hz, and the masker phase φ at which the probe is
switched on is zero.
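
The timing constraints of the phase-locked stimulus can be sketched in a few lines. This is only an illustration under stated assumptions: the raised-cosine shape of the 3 ms ramps, the probe starting at its own zero phase, and all function names are hypothetical choices, since only "smoothed edges" and the integral-period conditions are specified.

```python
import math


def phase_locked_timing(f_probe_hz, f_masker_hz, probe_ms=50.0, repeat_ms=500.0):
    """Probe duration as a whole number of probe periods and repetition
    time as a whole number of masker periods, nearest the nominal values."""
    n_probe = max(1, round(probe_ms * 1e-3 * f_probe_hz))
    n_masker = max(1, round(repeat_ms * 1e-3 * f_masker_hz))
    return n_probe / f_probe_hz, n_masker / f_masker_hz


def probe_envelope(t_in_burst, duration_s, rise_s=0.003):
    """Assumed raised-cosine onset and offset ramps (shape not specified)."""
    if t_in_burst < 0.0 or t_in_burst > duration_s:
        return 0.0
    if t_in_burst < rise_s:
        return 0.5 * (1.0 - math.cos(math.pi * t_in_burst / rise_s))
    if t_in_burst > duration_s - rise_s:
        return 0.5 * (1.0 - math.cos(math.pi * (duration_s - t_in_burst) / rise_s))
    return 1.0


def stimulus_sample(t, f_probe_hz, f_masker_hz, a_probe, a_masker,
                    masker_phase=0.0, probe_ms=50.0, repeat_ms=500.0):
    """Masker plus probe at time t (seconds). Each probe burst starts when
    the masker is at `masker_phase`, because the repetition time is a
    whole number of masker periods."""
    duration, repeat = phase_locked_timing(f_probe_hz, f_masker_hz,
                                           probe_ms, repeat_ms)
    t_in_burst = t % repeat  # time since the most recent probe onset
    masker = a_masker * math.sin(2.0 * math.pi * f_masker_hz * t + masker_phase)
    probe = (a_probe * probe_envelope(t_in_burst, duration)
             * math.sin(2.0 * math.pi * f_probe_hz * t_in_burst))
    return masker + probe


if __name__ == "__main__":
    print(phase_locked_timing(1000.0, 1060.0))
```

Because the repetition time is rounded to a whole number of masker periods, it deviates slightly from 500 ms for most masker frequencies, consistent with the "about 2 Hz" repetition frequency.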

Seated in a sound-insulated booth (IAC 400 A) the subject
listens to the stimulus diotically (in some cases monotically)
through Sennheiser HD 414 headphones. The subject adjusts
the masker frequency f_m, the masker level L_m or the probe
level L_p so that the probe is just inaudible. The criterion
is: detection of anything. L_m and L_p can be adjusted in
steps of 0.3 dB; f_m is continuously adjustable. Frequency
and attenuator position are remote-printed and the subject
receives no information on the value adjusted by him. Further
details are given in Vogten (1972).

3. Results for a 1 kHz probe of 30 ms

3.1 Iso-L_m curves (Fig. 2)

For subject LV the adjusted probe threshold levels are
plotted in fig. 2 for various masker levels. These Λ-shaped
curves will be referred to as "iso-L_m curves" in
order to make a clear distinction between the well known
classical masking curves and our curves, in which f_p
instead of f_m is fixed.

Generally speaking, we recognize the iso-L_m curves as
"reversed" masking curves. The slope on the high-frequency
side does not change very much with respect to L_m and
amounts to roughly 110 dB/oct. On the low-frequency side
the slope decreases for increasing L_m, from 60 dB/oct to
about 15 dB/oct at high levels. Similar slopes have been
found by Egan/Hake (1950), Ehmer (1959), Zwicker (1967)
and Greenwood (1971).
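
The slope figures quoted above are in dB per octave; a small helper makes the unit concrete. The sketch below is illustrative only, with hypothetical function names, and treats slopes as magnitudes without a sign convention.

```python
import math


def slope_db_per_octave(f1_hz, level1_db, f2_hz, level2_db):
    """Slope between two points of a curve, in dB per octave."""
    return (level2_db - level1_db) / math.log2(f2_hz / f1_hz)


def level_change_db(slope_db_oct, f_from_hz, f_to_hz):
    """Level change along a straight slope (in log frequency)."""
    return slope_db_oct * math.log2(f_to_hz / f_from_hz)


if __name__ == "__main__":
    # A 110 dB/oct slope over a 10 % change in masker frequency.
    print(round(level_change_db(110.0, 1000.0, 1100.0), 1))
```

At 110 dB/oct, moving the masker by only 10 % in frequency changes the threshold by roughly 15 dB, which is why small frequency steps matter in the region close to the probe.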

Details, however, show marked phenomena in the region of
small frequency differences between masker and probe. At
extremely low masker levels (only a few dB masking), the
iso-Lm curve is almost symmetrical around 1 kHz. The
sharp dip at exactly 1 kHz is linked to the physical
interaction between probe and masker. It disappears when
the phase shift between masker and probe is set to ½π.
Just below the absolute threshold, in-phase addition of the
masker causes the probe to exceed the threshold of
audibility. So the masker then has a negative masking effect
(cf. Raab et al., 1963; Leshowitz/Raab, 1966).
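
The physical interaction between an equal-frequency probe and masker can be illustrated with a short calculation. The sketch below is not part of the original experiment; it simply adds two sinusoids of the same frequency and reports the level change of the sum relative to the masker alone. The function name and the example value of -20 dB for the probe level relative to the masker are assumptions chosen for illustration.

```python
import math

def combined_level_change_db(probe_re_masker_db, phase_rad):
    """Level of (masker + probe) relative to the masker alone, in dB.

    Both tones have the same frequency; phase_rad is their phase difference.
    """
    r = 10 ** (probe_re_masker_db / 20.0)  # probe amplitude re masker amplitude
    # Amplitude of the sum of two sinusoids with relative amplitude r.
    amplitude = math.sqrt(1.0 + r * r + 2.0 * r * math.cos(phase_rad))
    return 20.0 * math.log10(amplitude)

# Hypothetical example: probe 20 dB below the masker.
in_phase = combined_level_change_db(-20.0, 0.0)
quadrature = combined_level_change_db(-20.0, math.pi / 2)
print(f"in phase:   {in_phase:.3f} dB")
print(f"quadrature: {quadrature:.3f} dB")
```

With in-phase addition the level increment is roughly twenty times larger than with a quarter-cycle phase difference, which is one way to see why a cue that exists only at exactly equal frequencies can vanish when the phase is changed.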

A new phenomenon is that for a 1 kHz probe the maximum
masking does not always occur at fm = 1 kHz. The frequency
of the masker exerting maximum masking effect upon the
probe depends on the intensity of probe and masker.

The masker frequency at which a fixed-frequency probe is
maximally masked will be abbreviated as MMF. For a weak
masker, say 35 dB SPL, the MMF is about 60 Hz higher than
fp. This upward shift with respect to the probe frequency
will be called "positive MMF shift". When the masker level
is increased, the MMF shifts downward. At intermediate
levels (Lm about 70 dB SPL and Lp about 45 dB SPL), the
top of the iso-Lm curves is situated symmetrically around
fm = fp, with a small dip at exactly fp. A strong masker
(80 dB SPL or more) produces maximum masking at a frequency
significantly lower than fp: a "negative MMF shift".

3.2 Iso-Lp curves (Fig. 3)

A complement to the iso-Lm curves is provided by curves
showing the masker level Lm just required to mask a
fixed-level probe. These V-shaped curves are plotted in
fig. 3 for various probe levels and will be called "iso-Lp
curves".

The interrelationship between iso-Lp and iso-Lm curves is
given by the masking surface of fig. 4, in which the probe
threshold shift, i.e. the masking, is plotted as a function
of both masker frequency and masker level.
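
How the two families of curves are cut from one masking surface can be sketched in a few lines. The surface used here is a made-up toy function, not the measured data of fig. 4: the 25 dB offset and the fixed flank slopes of 60 and 110 dB/oct are assumptions loosely inspired by the values quoted in the text, and all function names are hypothetical.

```python
import math

def masked_threshold(fm_hz, lm_db, fp_hz=1000.0):
    """Toy masked probe threshold Lp (dB SPL); an assumed surface, not data."""
    octaves = math.log2(fm_hz / fp_hz)
    slope = 60.0 if octaves < 0 else 110.0  # dB/oct on either side of fp
    return lm_db - 25.0 - slope * abs(octaves)

def iso_lm_curve(lm_db, fm_list):
    """Cut at fixed masker level: probe threshold as a function of fm."""
    return [masked_threshold(fm, lm_db) for fm in fm_list]

def iso_lp_curve(lp_db, fm_list, lo=0.0, hi=200.0):
    """Cut at fixed probe level: masker level needed at each fm (bisection)."""
    curve = []
    for fm in fm_list:
        a, b = lo, hi
        for _ in range(60):
            mid = 0.5 * (a + b)
            if masked_threshold(fm, mid) < lp_db:
                a = mid  # masker still too weak to mask the probe
            else:
                b = mid
        curve.append(0.5 * (a + b))
    return curve

fms = [500.0, 800.0, 1000.0, 1250.0, 2000.0]
lm_cut = iso_lm_curve(70.0, fms)   # peaked (inverted-V) curve
lp_cut = iso_lp_curve(45.0, fms)   # V-shaped curve
print([round(x, 1) for x in lm_cut])
print([round(x, 1) for x in lp_cut])
```

In this toy surface the iso-Lm cut peaks and the iso-Lp cut bottoms out at fm = fp; the level-dependent shifts of these extremes reported in the text would require a surface whose shape changes with level.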







[Figure 5: level in dB SPL (-20 to 100) plotted against frequency in kHz (0.05 to 20).]

Fig. 5: Iso-Lp curves for subject LV (solid curves). Diamonds
indicate the respective probe levels. These are chosen in
such a way that the probe sensation level (without masker)
is 10 dB SL at the corresponding fp (arrows). The dotted
line is the threshold of audibility for the masker.




Fig. 6: Iso-Lp curves for subject CS (right ear). As in fig. 5,
the probe sensation level is 10 dB SL, except for the
1 kHz probe for which Lp = 5 dB SL.





The slope of the steepest flank in fig. 3 is about
220 dB/oct and almost independent of the intensity. These
results agree with data of Small (1959), who measured
iso-Lp curves for 13 and 30 dB masking of a pure-tone
probe by a pure-tone masker. As in fig. 2, it can be seen
that the MMF depends on the intensity. The minima are
shifted with respect to fp.

3.3

Both fig. 2 and 3 indicate that masking is generally a
non-linear process. Only when masker and probe frequency
are exactly equal is there a rather large intensity range
(Lm between 30 and 70 dB SPL) where the probe amplitude
increases proportionally to the amplitude of the masker.
In this range Lp plotted as a function of Lm is a straight
line with slope one, as illustrated by the dotted iso-fm
curve in fig. 4 for fm = 1 kHz. Lower fm involves a slope
less or greater than one, depending on intensity and on
frequency, as can be seen from fig. 2. For higher fm the
slope is always much less than one.
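
Why proportional amplitudes give a straight line of slope one on a decibel plot can be written out in one step. The symbols a_p, a_m (probe and masker amplitudes), a_0 (reference amplitude) and c (a constant of proportionality) are introduced here for illustration only.

```latex
\begin{align}
  L_p &= 20\log_{10}\frac{a_p}{a_0}, \qquad
  L_m = 20\log_{10}\frac{a_m}{a_0}, \\
  a_p &= c\,a_m
  \;\Longrightarrow\;
  L_p = L_m + 20\log_{10} c
  \;\Longrightarrow\;
  \frac{\mathrm{d}L_p}{\mathrm{d}L_m} = 1 .
\end{align}
```

A slope different from one therefore means that the probe amplitude at threshold grows as some other power of the masker amplitude.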

4. Results for other probe frequencies

The foregoing raises the question whether the MMF shift, as
found for a 1 kHz probe, also exists for other probe
frequencies. Fig. 5 and 6 show iso-Lp curves for two subjects,
fp varying from 0.1 up to 8 kHz. At all probe frequencies Lp
is chosen in such a way that the sensation level of the probe
(without masker) amounts to 10 dB. The absolute threshold of
the masker is indicated by the dotted line.

Dips and positive MMF shifts occur for almost every probe 
frequency. Above 0.3 kHz the MMF shift is roughly proportion- 
al to the probe frequency. Systematic measurements at var- 
ious intensities have not yet been carried out except for 
1 kHz. 

Inter-subject differences are marked. Subject CS shows much 
sharper and steeper curves than subject LV. Maximum slopes 
are 350 dB/oct for CS and 320 dB/oct for LV. 








Fig. 7: Upper part: iso-Lp curves for 1 kHz probes of short duration
(6 ms, dotted curve) and long duration (200 ms, solid curve).
For both probes the sensation level (without masker) is 15 dB SL.
The stimulus is presented to the left ear only.
Lower part: the amplitude spectrum of the 6 ms probe (dotted
curve) and the spectral envelope of the 200 ms probe (solid curve).

Fig. 8: Parts of iso-Lp curves for a 10 ms/1 kHz probe (subject LV).







The resemblance between these iso-Lp curves and neurophysiological
"tuning curves" will be discussed as well.

5. Results for other probe durations

MMF shifts are also found for probe durations other than 50
ms. Very short (6 ms) and very long (200 ms) probes provide
the results shown in fig. 7. For both curves the probe has
(without masker) a sensation level of 15 dB. So the amount
of masking is 5 dB higher than for the 50 ms probe of fig. 5.
As an illustration the measured probe amplitude spectra are
also shown at the bottom of fig. 7.

Obviously, the positive MMF shift at low levels is complete- 
ly independent of probe duration. 

From fig. 7 two other facts can be noticed: a) a short probe
duration goes with a distinct second (local) minimum; the
1 kHz dip broadens with respect to a long probe, and b) the
spectrally wider probe is accompanied by a narrower iso-Lp
curve, although both probes have equal sensation levels
(both 15 dB masking).

Fig. 8 shows parts of iso-Lp curves for a 10 ms probe at
various levels. At low intensities the MMF is again 1060 Hz,
whereas at high levels a masker of 870 Hz is most effective.
Although a MMF shift of -130 Hz for Lp = 63 dB SPL is more
than the -30 Hz shift for Lp = 60 dB SPL of a 50 ms probe
(fig. 3), we have to keep in mind that at these high levels
a 3 dB difference in probe level causes a substantial shift of
the extreme. So from fig. 3 and fig. 8 we cannot conclude
that the negative MMF shift depends on probe duration. On
the contrary, preliminary experiments not reported here have
shown that the negative MMF shift does not depend on probe
duration.



6. Discussion

The similarity between the iso-Lp curves (figs. 3 and 6) and
neurophysiological "tuning curves" from single auditory nerve
fibres is striking. Chistovich (1971) already compared tuning
curves with "equal masking contours" of Small (1959). The
marked resemblance suggests a common underlying mechanism
which may be indicated as follows.

Addition of a masker of proper amplitude and frequency to a 
probe will cause a change of the activity of fibres original- 
ly excited by the probe only. Suppose that the shift of the 
probe threshold is directly linked up with this change pro- 
duced by the masker. The probe can then be conceived as a 
psychophysical "electrode". With the aid of this "electrode"
the activity produced by a test signal (the masker) is meas- 
ured at a certain fixed place of the basilar membrane. Posit- 
ion and extension of this area depend on both spectrum and 
level of the probe. Therefore, in masking experiments the 
best picture of masker activity is obtained when both spec- 
trum (including the probe carrier frequency) and probe level 
are kept constant. 

In this train of thought psychophysical curves of equal mask- 
ing or iso-Lp curves are comparable with physiological "iso- 
rate functions" in general and, for low probe levels, with 
"tuning curves" in particular. Similarly, the "rate functions" 
or "iso-intensity contours" (e.g. Rose et al. 1971) from 
single fibres form in a way an analogue of our iso-Lm curves.

If the spike rate of a fibre is plotted as a function of 
both stimulus frequency and stimulus level, then the "res- 
ponse surface" obtained in this way provides a picture sim- 
ilar to the masking surface of fig. 4. Of course we have to 
keep in mind that the neural data concern only one fibre , 
which need not be the case for the psychophysical masking 
data. 

Nevertheless, in search of explanations for top shifts as 
found in the experiments, it might be relevant to mention 
some similar shifts occurring in the physiological literature. 
Shifts corresponding to our negative MMF shifts for high 
sound levels are discernible in

- Rose et al. (1971: fig. 2B), on a single auditory nerve
fibre with a best frequency CF of 2.1 kHz,

- Honrubia/Ward (1968: fig. 3 and 6), on the distribution
of cochlear microphonics along the cochlear duct,

- Spoor/Eggermont (1971: fig. 2), on the "masking" of whole
nerve action potentials.

A shift corresponding to our positive MMF shift for low sound
levels can be found in Finck (1966: fig. 5) on the suppression
of slow gross potentials. In addition, a negative MMF
shift is measured by Zwislocki et al. (1968: fig. 9) in
psycho-acoustical experiments on contralateral ("central")
masking.

These data bring us to a first possible explanation of the
MMF shifts. Suppose that the maximum of the cochlear excitation
pattern moves toward the stapes when stimulus intensity
increases. Such a shift does not contradict the results of
e.g. Honrubia/Ward (1968), Spoor/Eggermont (1971) and
Zwislocki et al. (1968). Assume further that maximum masking
occurs when the tops of probe and masker excitation coincide.
A negative MMF shift, increasing with stimulus level, can
then be expected.

If the relation between probe and masker amplitude were
linear, either a constant negative MMF shift or no shift at all
would occur. But at maximum masking the relation between
probe and masker amplitude is not a linear one. The
masker-to-probe ratio increases from 20 dB at Lm = 30 dB SPL
to 30 dB at Lm = 100 dB SPL, as can be deduced from the
extremes of fig. 2 and 3.
A shift of, say, 30 to 50 Hz per 10 dB level increment,
coming into operation at 30 to 70 dB SPL, could quantitatively
explain a MMF shift of -60 up to -130 Hz at high levels.
The positive MMF shift, however, found at low levels, cannot
be accounted for by such a shift of the cochlear excitation
pattern. To that end we need the additional assumption that,
increasing the intensity from low levels onward, the excitation
first shifts toward the apex. This seems too far-fetched
an assumption, at least for the moment.

A second underlying process of MMF shifts may be related to
a changing slope of the excitation pattern. In the foregoing
explanation we concentrated on one "point" of the excitation
pattern only, viz. the top. The detection of the
probe may involve a certain finite bandwidth, however. Then,
owing to an increasing asymmetry of the excitation pattern,
we might expect the masking to be maximum at a frequency
lower than the probe frequency. So again a negative MMF shift
will occur.

Assume for a moment that the probe threshold is determined
only by the amplitude ratio of probe + masker excitation to
masker excitation and that this ratio is integrated over a
certain bandwidth symmetrically situated around the probe
frequency. Minimizing this integrated ratio as a function of
the masker frequency will provide a theoretical value of the
MMF. Preliminary calculations for fp = 1 kHz have shown that
for flank slopes of 1/5 and 1/30 dB/Hz (high levels) an
integration bandwidth of 160 Hz leads to a predicted MMF shift
of about -60 Hz. The main parameters affecting the magnitude
of the predicted MMF shift are the bandwidth and the ratio
of the flank slopes. In principle, the negative MMF shift
can thus be explained by an increasing asymmetry of the
excitation pattern for high levels. Quantitatively, however,
we need an integration bandwidth of at least 160 Hz in order
to account for shifts of about -50 Hz.
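
The minimization described above can be tried out numerically. The following is only a sketch under stated assumptions and is not the author's calculation: probe and masker are both given the same triangular excitation pattern (in dB against frequency in Hz) with a steep flank of 1/5 dB/Hz below the tone and a shallow flank of 1/30 dB/Hz above it, the probe peak is taken 25 dB below the masker peak, amplitudes are added directly, and the integral is a midpoint sum in 1 Hz steps. All names are hypothetical.

```python
def pattern_db(offset_hz, steep=1.0 / 5.0, shallow=1.0 / 30.0):
    """Assumed excitation (dB re peak) at a place offset_hz from the tone."""
    if offset_hz < 0:
        return steep * offset_hz       # steep flank below the tone
    return -shallow * offset_hz        # shallow flank above the tone

def integrated_ratio(shift_hz, bandwidth_hz=160.0, lp_minus_lm_db=-25.0):
    """Integral over the band of (probe + masker) / masker excitation amplitude.

    shift_hz is the masker frequency minus the probe frequency.
    """
    total = 0.0
    steps = int(bandwidth_hz)
    for i in range(steps):
        u = -bandwidth_hz / 2.0 + i + 0.5            # place re probe frequency
        probe_db = lp_minus_lm_db + pattern_db(u)
        masker_db = pattern_db(u - shift_hz)
        total += 1.0 + 10 ** ((probe_db - masker_db) / 20.0)
    return total

# The masker frequency that minimizes the integrated ratio masks best.
candidates = range(-150, 151)
mmf_shift = min(candidates, key=integrated_ratio)
print("predicted MMF shift (Hz):", mmf_shift)
```

Under these particular assumptions the minimum falls below the probe frequency, so the sign of the predicted shift agrees with the argument in the text. Its size depends on the assumed pattern shapes and on the bandwidth, and need not reproduce the value quoted above.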

To explain the positive MMF shift, this changing slope of the
excitation pattern seems not very useful. For a shift of,
say, +60 Hz we need a strong asymmetry of the excitation in
the wrong direction and, moreover, an integration bandwidth
of at least about 200 Hz.

Summarizing, the negative MMF shift at high levels can be ex- 
plained in two ways, based on a top shift and/or on an asym- 
metry of the cochlear excitation pattern. For the positive 
MMF shift, however, neither of them appears adequate. Perhaps 
at these low levels quite other mechanisms play a role in the 
masking process. 

A first possibility is that, although in itself inaudible, a
combination tone contributes to the audibility of the probe.
Then, at e.g. fm = 940 Hz the masker level required to mask
the probe has to be higher than at fm = 1060 Hz. This would
be in agreement with the results of fig. 7. An argument
against this assumption is, however, that such an asymmetry
as found for a 6 ms probe should become more prominent for
increasing levels, whereas our measurements show the very
opposite.

As a second possibility the positive MMF shift might be traced 
to the same process as that underlying two-tone suppression. 
Results of Sachs/Kiang (1968: fig. 3 and 6) show that a
second tone ("masker") has its maximum suppressing effect upon
the spike rate when the "masker" frequency is higher than
the frequency of the first tone ("probe") tuned at the best
frequency of the fibre. This similarity, of course, only
shifts the question about the origin of the positive MMF shift
to a question about the origin of the asymmetry in two-tone
suppression.

7. Conclusions

1. Our pure-tone masking experiments, using a phase-locking
technique, showed that maximum masking occurs at equal
probe and masker frequency only for medium levels.

2. For high levels a pure tone produces maximum masking at
a frequency significantly lower than the frequency of
the probe. This "negative MMF shift" can be attributed
to a top shift and/or to an increasing slope asymmetry
of the cochlear excitation pattern.

3. For low levels maximum masking is produced at a
frequency above the probe frequency. This "positive MMF
shift" amounts to about 60 Hz for a 1 kHz probe.

4. At 1 kHz both the positive and the negative MMF shift
are independent of probe duration.

5. Positive MMF shifts also occur for lower and for higher
probe frequencies. Whether negative MMF shifts can be
found for probe frequencies other than 1 kHz remains to
be investigated.







8. References

Bos, C.E., de Boer, E. (1966) Masking and Discrimination,
J. Acoust. Soc. Amer. 39, 708-715

Chistovich, L.A.V. (1971) Auditory Processing of Speech
Stimuli - Evidences from Psychoacoustics and Neurophysiology,
Proc. 7th Int. Congress on Acoustics, Budapest
1971 , Vol. I, 27-^1 (21 G 1) 

Egan, J., Hake, P. (1950) On the Masking Pattern of a Simple
Auditory Stimulus, J. Acoust. Soc. Amer. 22, 622-630

Finck, A. (1966) Physiological Correlate of Tonal Masking,
J. Acoust. Soc. Amer. 39, 1056-1062

Greenwood, D. (1971) Aural Combination Tones and Auditory
Masking, J. Acoust. Soc. Amer. 50, 502-543

Honrubia, V., Ward, P. (1968) Longitudinal Distribution of
the Cochlear Microphonics inside the Cochlear Duct
(guinea pig), J. Acoust. Soc. Amer. 44, 951-958

Leshowitz, B., Raab, D.H. (1967) Effects of Signal Duration
on the Detection of Sinusoids added to Continuous
Pedestals, J. Acoust. Soc. Amer. 41, 489-496

Raab, D.H., Osman, E., Rich, E. (1963) Effect of Waveform
Correlation and Signal Duration on Detection of Noise
Bursts in Continuous Noise, J. Acoust. Soc. Amer. 35,
1942-1946

Rose, J.E., Hind, J.E., Anderson, D.J., Brugge, J.F. (1971)
Some Effects of Stimulus Intensity on Response of Auditory
Nerve Fibers in the Squirrel Monkey, Journ. Neurophysiol.
34, 685-699

Sachs, N.B., Kiang, N.Y. (1968) Two-Tone Inhibition in
Auditory Nerve Fibers, J. Acoust. Soc. Amer. 43,
1120-1128

Small, A. (1959) Pure Tone Masking, J. Acoust. Soc. Amer.
31, 1619-1625

Spoor, A., Eggermont, J.J. (1971) Action Potentials in the
Cochlea, Audiology 10, 340-352

Vogten, L.L.M. (1972) Pure-Tone Masking of a Phase-Locked
Tone Burst, I.P.O. Ann. Progr. Rep. 7, 5-16

Wegel, R.L., Lane, C.E. (1924) The Auditory Masking of
One Pure Tone by Another and its Probable Relation to
the Dynamics of the Inner Ear, Phys. Rev., 266-283

Zwicker, E., Feldtkeller, R. (1967) Das Ohr als
Nachrichtenempfänger, 2nd Edition, p. 63 and 64, Hirzel
Verlag, Stuttgart 1967

Zwislocki, J., Buining, E., Glantz, J. (1968) Frequency
Distribution of Central Masking, J. Acoust. Soc. Amer.
43, 1267-1271







FREQUENCY SELECTIVITY AND THE TONAL RESIDUE

R.J. RITSMA AND A. HOEKSTRA

Institute of Audiology, University Hospital, 

Groningen, The Netherlands. 

Introduction 

The accuracy with which the repetition frequency of a band-filtered
pulse train (residue) can be determined appears to be much higher
for combinations of filter frequency f and repetition frequency g
leading to a tonal residue than for combinations of which the
result is an atonal residue (Ritsma (1971)). This finding
(for f = 2 kHz) suggests that the existence region of the tonal
residue can be determined by means of frequency-discrimination
measurements. The available data regarding the existence region
have been obtained using the criterion: "Is the correct low pitch
present?" (Ritsma (1962), Walliser (1968)). With that aim
experiment I was carried out. Besides, this gave the possibility to
consider the importance of frequency resolution for tonality from
a different point of view.

It is also known that frequency discrimination diminishes with
decreasing signal-to-noise ratio (Cardozo (1971)). The influence of
this variable upon a residue is considered in experiment II.
The importance of period detection for tonality could thus be
analysed in more detail.

Experiment I 

The jnd in repetition frequency Δg of 1/3-octave filtered periodic
pulse trains was measured as a function of the repetition frequency
g at various filter frequencies f. The slope of the 1/3-octave
filter used (B & K 1613) was about 100 dB/octave. The pulse width
amounted to 100 µsec for f ≤ 4 kHz and 50 µsec for f > 4 kHz. The
jnd was determined by means of the AX-method. Stimuli were presented
diotically at 40 dB SL through headphones (TDH 49).

The stimuli and the silent interval in between both lasted 600 msec
(for g<50 Hz 1200 msec and 600 msec, respectively).





The subject had to indicate the sequence of the stimuli, whatever
the subjective impression might be, by pressing a button. Visual
feedback followed immediately. Subjects were free to choose their
own tempo in presentation of and response to a pair of stimuli.

The jnd in repetition frequency of periodically interrupted noise
was measured too, using the same procedure. This stimulus was
produced by periodic interruption (duty cycle 1/3) of wide-band noise
(4 octaves around f), followed by 1/3-octave filtering at f.

Here f/g^10 applied. 

Three subjects participated in this experiment. 

Results

Results are shown in fig. 1 as the mean of 3 subjects (every point 
represents at least one threshold determination per subject). 




Fig. 1 Results from experiment I. Mean of 3 subjects.
The boundary of the existence region of the tonal residue
is marked.





The values found for the jnd of pure tones (g = f) are in agreement
with the literature (Rakowski (1971)).

It turns out that in the case of the pulse train a similar relation
exists between Δg/g and g for all f, viz. pure tone accuracy up to
n = f/g = 8, followed by a rapid increase in Δg/g for 8<n<20 until
a constant value Δg/g ≈ 0.02 is reached. This value appears to be
independent of f, unlike the pure tone accuracy. Moreover, it equals
Δg/g of periodically interrupted noise for g<100 Hz. The jnd of
periodically interrupted noise increases sharply for g>100 Hz.
At g = 800 Hz and above determination of the jnd is meaningless.
This fits the subjective sensation that the periodically
interrupted noise (manifested as noise with a flutter superimposed)
resembles unaffected noise ever more.
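
The three regimes of the pulse-train results can be summarized as a small lookup on the harmonic number n = f/g. This is only a restatement of the boundaries quoted above (n = 8 and n = 20); the function name and the regime labels are chosen here for illustration.

```python
def jnd_regime(f_hz, g_hz):
    """Classify a filter frequency / repetition frequency pair by n = f/g."""
    n = f_hz / g_hz
    if n <= 8:
        return "pure-tone accuracy"
    if n < 20:
        return "rapid increase of relative jnd"
    return "plateau (relative jnd about 0.02)"

# Example pairs at f = 2 kHz.
for g in (400.0, 200.0, 40.0):
    print(g, jnd_regime(2000.0, g))
```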

Discussion 

Because in the case of periodically interrupted noise spectral
information is lacking, period detection is the only possible
detection criterion. This fact, together with the equality of Δg/g of
the interrupted noise and the pulse train for n>20, permits the
results in the case of the pulse train for n>20 to be attributed
to period detection too. The deviation between interrupted noise
and pulse train for g>100 Hz may be caused by the lack of
correlation between the noise bursts.

In order to get more insight into the behaviour of the jnd of the
pulse train for n<20, the results are plotted in fig. 2 as a
function of n = f/g (= the number of the harmonic with frequency f).
Besides, a normalization is applied so that Δg/g = 1 for g = f.
Owing to a larger disparity among the subjects and to an asymmetrical
frequency spectrum of the stimulus caused by the frequency
characteristic of the headphone, the curve of f = 8 kHz deviates
somewhat. Otherwise the trend is strikingly similar. A second,
important point is the fact that the shape of the curves is not
influenced by whether the residue is tonal or not; in other
words, one and the same behaviour is found in and outside the
existence region of the tonal residue.








[Figure 2: abscissa n = f/g.]

Fig. 2 The normalized relative jnd in repetition frequency
of a periodic pulse train as a function of the harmonic
number n. Mean of 3 subjects.

From the above it can be concluded that only one detection mechanism
must be responsible for the increase of Δg/g for n>8 and that this
mechanism must operate in the frequency domain. However, for n>20
the information based on frequency analysis has become so slight
that period detection constitutes a more useful discrimination
criterion. Thus the frequency analysing power of the hearing system
is limited naturally.

A relation between the existence region of the tonal residue and
the experimental results can be found by connecting those
combinations f and g that mark the boundary of the existence region in
the curves of fig. 1. It is remarkable that the upper boundary of
the existence region (smallest g at a particular f for which
tonality is still perceivable) coincides very well with the plateau at
Δg/g = 0.02. This implies that, given the above explanation
of the experimental results, the possibility of frequency analysis
is a necessary condition for tonality of a residue constituted of
a bandfiltered pulse train.






In this context it is important not to confuse frequency analysis 
with identification of a distinct frequency component in the complex 
stimulus. 

The decrease in ability to discriminate upon spectral information 
for n>8 can be interpreted as a signal-to-noise problem. It is 
known that a pure tone activates a finite frequency region internally, 
with a slope of more than 100 dB/octave. As a result of this spread 
an interaction takes place between the excitations of the individual 
harmonics. This creates as it were a steady, internal noise back- 
ground, increasing with decreasing g. If this reasoning is correct, 
comparable results must be obtained for Δg/g as a function of the
signal-to-noise ratio in case of a pulse train in external noise. 

Experiment II *)

Using the same procedure as in experiment I and under the same
conditions, Δg/g of 1/3-octave filtered pulse trains was measured as
a function of the signal-to-noise ratio with parameters g and f.

The background noise was generated by a maximum length sequence
generator (HP 3722A) ((2 - 1) bits of 10 µsec) and led through
the same 1/3-octave filter. The noise was presented in synchrony 
with the pulse train. In this way the noise background was the same 
for every stimulus. The noise level just masking a pulse train at 
40 dB SL was determined using the same procedure (detection thres- 
hold) and was used as a reference level. A definite signal-to- 
noise ratio was obtained by attenuating the noise by the desired 
amount of decibels with regard to this level. 
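
The reason a maximum length sequence yields the same noise background for every stimulus is that it is deterministic and periodic: a shift register of n stages, suitably fed back, cycles through 2^n - 1 states before repeating. The sketch below illustrates this with a 4-stage register; the register length, the feedback taps and the seed are assumptions chosen for a short example and say nothing about the settings of the instrument actually used.

```python
def mls_bits(n_bits=4, taps=(0, 1), seed=1):
    """One period of a maximum-length sequence from a Fibonacci shift register.

    taps are the register positions XOR-ed to form the feedback bit; the
    default pair gives a maximal period for a 4-stage register.
    """
    state = seed
    out = []
    for _ in range(2 ** n_bits - 1):
        out.append(state & 1)                 # output the lowest stage
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = (state >> 1) | (feedback << (n_bits - 1))
    return out

period = mls_bits()
print(len(period), sum(period))
```

Restarting the register from the same seed in synchrony with each stimulus reproduces the identical pseudo-random waveform, which is the property exploited in the experiment.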

Results 

In fig. 3 results (the mean of 3 subjects) are shown for the
configurations n = 1 (pure tone) and n = 10 (f = 1, 2 & 4 kHz)
(every point represents at least one threshold determination per
subject). For signal-to-noise ratios S/N>20 dB discrimination is
maximal. For smaller S/N, Δg/g increases for n = 1 and n = 10 in
nearly the same way. It is remarkable that discrimination is still
possible at S/N = 0 (detection threshold) with an accuracy of 0.03
(see also Cardozo (1971)).

*) This experiment was carried out by Mr. W. Kronemeyer.








Fig. 3 The relative jnd in repetition frequency of a periodic
pulse train for various filter frequencies as a function
of the signal-to-noise ratio with regard to the detection
threshold (mean of 3 subjects). Theoretical predictions
are also shown.



Fig. 4 shows results at f = 2 kHz for g = 200, 150, 138, 100 & 40
Hz from one subject. As expected the curves for g = 100 Hz and
g = 40 Hz coincide. Curves for g = 200 Hz and g = 150 Hz show a
similar rising trend, but the curve for g = 138 Hz is irregular and
deviates strongly. The curves of the other subjects behaved
likewise. Here the subjective sensations of the subjects are important.
It appeared that a transition region exists (roughly 5 < S/N < 9) in
which the sensation "noise with a clear contrasting signal" changes
into a sensation of "coloured noise".

Finally it should be mentioned that measurements with octave band 
noise yielded exactly the same results for n = 10. 








Fig. 4 Results from one subject for various n at 

f = 2 kHz; hatched area = transition region 
(see text). — — = theoretical predictions. 



Discussion 



The results from the experiments I and II can be related as was 
postulated before. Assuming a symmetrical excitation pattern for a 
pure tone, the peak/valley ratio (P/V) in the excitation pattern of 
a pulse train for component n can be calculated from the following 
formula:

[The formula for P/V is not legible in this copy.]


Parameter is S = slope of the excitation pattern in dB/octave. 
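
One way to see how a peak/valley ratio follows from the slope S is a direct summation. The sketch below is an assumption-laden illustration and should not be read as the formula used by the authors: every harmonic is given a symmetrical triangular excitation pattern with slope S dB/octave, all harmonics have equal level, contributions are added as powers, the peak is taken at harmonic n and the valley at the geometric mean of harmonics n and n + 1. The band filtering of the stimulus is ignored. Function names and the value of g are hypothetical.

```python
import math

def excitation_power(place_hz, g_hz, slope_db_per_oct, n_harmonics=200):
    """Summed power at one place from equal-level harmonics of g_hz."""
    total = 0.0
    for k in range(1, n_harmonics + 1):
        distance_oct = abs(math.log2(place_hz / (k * g_hz)))
        total += 10 ** (-slope_db_per_oct * distance_oct / 10.0)
    return total

def peak_valley_db(n, slope_db_per_oct=150.0, g_hz=100.0):
    """Peak/valley ratio (dB) around harmonic n under the assumptions above."""
    peak = excitation_power(n * g_hz, g_hz, slope_db_per_oct)
    valley_place = math.sqrt(n * (n + 1)) * g_hz
    valley = excitation_power(valley_place, g_hz, slope_db_per_oct)
    return 10.0 * math.log10(peak / valley)

for n in (5, 10, 20):
    print(n, round(peak_valley_db(n), 1))
```

In this sketch P/V falls steadily as n increases, because neighbouring harmonics move closer together on an octave scale and fill in the valleys. This is the qualitative behaviour the argument relies on.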

Based on the data of fig. 2 a normalized value of Δg/g can be
assigned to every n. The relation holding for 8<n<20 is extrapolated
for n>20. A family of curves Δg/g as a function of P/V with
parameter S results from this. The form of these curves hardly depends
on S; only the position relative to the P/V-axis changes. However,
we cannot fix on a reference point representative for the detection
threshold without more ado.







Choosing a value of 150 dB/octave for S (Maiwald (1967); Ritsma
(1968)), a good fit to the experimental results is obtained for
P/V = 1.5 dB at the detection threshold (fig. 3 & 4). Although there
are small deviations, the agreement between theory and experiment in
general is evident.

For n ≥ 14, however, a strong deviation from the prediction appears.
In order to facilitate the explanation of this discrepancy the
region in which a transition of subjective sensation occurs is marked
by hatching (fig. 4). For S/N < 5 dB discrimination has to be done on
the colouration of the noise. Consistent discrimination is only
possible if the signal can be sufficiently analysed spectrally.
For S/N > 9 dB the signal is perceived in clear contrast to the noise.
It can be postulated now that the pulse train can preserve its
identity only as long as its period is not completely disturbed by
the noise. This means that the curves for n ≥ 20 should have S/N = 9 dB
as an asymptote. Measurements seem to justify this conclusion.
Unfortunately, loudness differences in stimuli with large Δg have so
far prevented a precise confirmation.

To conclude, it can be stated that for a residue a reliable estima- 
tion of the period as well as the possibility of frequency analysis 
are essential conditions. Since the pure tone may be considered as 
a limiting case, the same conclusion is valid here. 

Literature 

Cardozo, B.L. (1971); Frequency Discrimination of Short Sinusoids as
a Function of Signal-to-Noise Ratio, IPO Annual Progress Report no. 6.

Maiwald, D. (1967); Beziehungen zwischen Schallspektrum, Mithörschwelle
und der Erregung des Gehörs. Acustica 18, 69.

Rakowski, A. (1971); Pitch Discrimination at the Threshold of Hearing.
Proc. 7th ICA, Budapest, 20H6.

Ritsma, R.J. (1962); Existence Region of the Tonal Residue I,
JASA 34, 1224.

Ritsma, R.J. (1968); On the Response Characteristics of the Ear.
IPO Annual Report no. 3.

Ritsma, R.J. (1971); Psychological Correlates of a Frequency Shift,
Proc. 7th ICA, Budapest, 19H16.

Walliser, K. (1968); Zusammenwirken von Hüllkurvenperiode und Tonheit
bei der Bildung der Periodentonhöhe. Thesis, München.







FREQUENCY DISCRIMINATION AT THE THRESHOLD 
B.L. CARDOZO 

Institute for Perception Research, Eindhoven, The Netherlands



In order to be able to hear a weak pure tone which is em- 
bedded in white noise, it is important to know its pitch. If 
the listener has the correct mental representation of the 
frequency of the sinusoid, then he will score well in a de- 
tection experiment; if not, then his performance will be at 
chance level under the same experimental conditions , as 
Greenberg and Larkin (1968) have shown.

This phenomenon is often attributed to the frequency
selectivity of the hearing system. A review of detection
experiments that were devised to get data on "frequency
selectivity" in threshold situations is to be found in Green and
Swets (1966). Unfortunately, the subject's mental representation
of the frequency of the stimulus, which is an essential
thing in these experiments, is difficult to control.

It would be important to know how accurate the perception 
of the frequency of a sinusoid is under these conditions. 

That is, what is the D.L. for frequency of a sinusoid at the 
masked threshold? It may be remarked that in experiments on 
frequency discrimination the subject has to concentrate on 
the pitch of a reference tone because his task is to detect 
a possible small deviation from this pitch in the test tone. 

A reasoning, applied by Zwicker (195^, 1970), Maiwald 

(1967) and others might be tested in the threshold situation.
This reasoning is based on the "postulate" that a change in
pitch is perceived whenever the excitation function at some
peripheral level changes by a certain amount, e.g. the
equivalent of 1 dB in sound level. The theoretical advantage of
this reasoning is its parsimony: one need not hypothesize
special detectors in the auditory system for the (change of)
frequency, because the very steep slope of the excitation
function will convert a small frequency change into a large
change in excitation at the same place.

It hardly needs stressing that there is an alternative view
of pitch perception: the detection of periodicity in the
acoustic time signal as it is represented in the periphery of
the neural system, that is, after the prefiltering process of
the hydromechanical system of the cochlea. As the
signal-to-noise ratio is lowered, the periodicity gets spoiled.
Pitch perception would amount to the excitation of a particular
periodicity detector and could thus be explained quite
naturally, albeit qualitatively. The close relation between
detection and pitch perception then follows logically.

The paper will first contrast frequency discrimination with
frequency resolution, which is known to be closely related to
the critical band. Then a brief survey will be given of a few
papers on frequency discrimination at poor signal-to-noise
ratios. This will give the setting for a new experiment,
which was performed at our institute by Mr. B. Borger van der
Burg. At the end the results of the experiment will be
discussed and conclusions formulated.

Frequency discrimination versus frequency resolution 

Frequency discrimination in the hearing system is the ability
to distinguish two non-simultaneous sounds which differ in
frequency but are otherwise equal. The simplest case is that
of two pure tones. Frequency discrimination is measured
(reciprocally) by the difference limen D.L., the just
noticeable difference in frequency*), or some other quantity
representing a frequency difference which can be perceived by
the listener with a certain standard reliability, given a
certain experimental procedure. A typical value for the D.L.
is 1 Hz at a frequency of 1000 Hz.

*) In this paper no distinction will be made between D.L.
   and j.n.d.

Frequency resolution in the hearing system is the ability
to perceive separately each of two simultaneous pure tones
which differ in frequency but are otherwise equal. In order
to assess experimentally what is perceived, one would like to
present a subject with either one or two tones and have him
report the number of tones presented. But this simple
experiment has no simple meaning, because beats and combination
tones will betray the situation when the tones are close in
frequency. A periodic pulse, or some other signal with a
spectrum consisting of a large number of adjacent harmonics
of equal intensity, is a stimulus which alleviates these
problems. Such stimuli have been used by many investigators
(cf. Helmholtz, 1913; Schouten, 1940; Plomp, 1964; Cardozo,
1967). Plomp's comprehensive measurements indicate that in the
frequency region from 200 to 5000 Hz a pure tone can be
resolved if the frequencies of its neighbours are about 12% to
20% apart. He found somewhat smaller percentages when only two
stimuli were used and special precautions were taken that the
subjects did not use cues other than the pitches perceived.
Cardozo used a special preconditioning stimulus consisting
of all components of a periodic pulse save one. After a very
short silent period, the full pulse spectrum was made audible
and the subject had to adjust the level of the one 'new'
component so that he could just indicate its pitch. Under these
special conditions, auditory frequency resolution is enhanced,
so that components differing by no more than 6% in frequency
from their neighbours could be resolved.

Summing up: frequency resolution applies to simultaneous
tones of equal intensity. Pure tones can be resolved when
their frequencies differ by a certain percentage. This
percentage may be made as small as 6% under special conditions,
but normally it is about 12 to 20%. These latter values are
also found for the critical band.
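
As a rough numerical illustration of these percentages, the
sketch below converts them into frequency spacings in Hz at
1000 Hz. The function name `min_spacing_hz` is illustrative and
not taken from the paper; the critical bandwidth of 160 Hz at
1000 Hz is the value implied by C = 27/160 dB/Hz in the
Discussion.

```python
def min_spacing_hz(frequency_hz, percent):
    """Spacing in Hz that corresponds to `percent` of `frequency_hz`."""
    return frequency_hz * percent / 100.0


centre_hz = 1000.0
critical_band_hz = 160.0  # critical bandwidth at 1000 Hz, as used later in the text

enhanced_hz = min_spacing_hz(centre_hz, 6)      # special preconditioning stimulus
normal_low_hz = min_spacing_hz(centre_hz, 12)   # lower end of the normal range
normal_high_hz = min_spacing_hz(centre_hz, 20)  # upper end of the normal range

print(enhanced_hz, normal_low_hz, normal_high_hz)
# The critical bandwidth falls inside the normal 12 to 20% range.
print(normal_low_hz <= critical_band_hz <= normal_high_hz)
```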

Frequency discrimination near the threshold 

The many papers devoted to the subject may be divided into
two groups. In the first group it is the absolute threshold
which causes the deterioration of frequency discrimination;
in the second group there is a masker consisting of white
noise.

The absolute threshold was approached by Shower and Biddulph
(1931) as far down as 3 dB SL. With a stimulus consisting of a
sinusoid which was slowly modulated in frequency, they found a
D.L. for frequency which was about three times as large as at
30 dB SL. Above this latter level, no appreciable improvement
of frequency discrimination was found. Similar results were
obtained by Harris (1952) with AX presentation of stimuli.

More recently, Rakowski (1971) presented measurements of
pitch discrimination with long sinusoids at 3 dB above the
absolute threshold of hearing. At 1000 Hz the standard
deviation of 12 measurements about the mean setting was about
0.3 Hz, averaged over sessions and subjects. Still closer
to the absolute threshold is a report from Pollack (1948),
who had three subjects adjust a threshold of audibility and
also a threshold for tonality of pure tones in quiet. At a
frequency of 1000 Hz, he found an 'atonal interval' of about
3 dB above threshold. For one of his subjects the frequency
increment which was detected correctly in 30% of the stimulus
presentations was measured as a function of the level. At
1000 Hz this increment was about 30 Hz at about 7 dB SL,
which is larger than would be expected on the basis of the
Shower and Biddulph data (9-^ Hz at 3 dB SL) the data of 
Harris (6 Hz at 3 dB SL), or of Rakowski. The discrepancy,
however, may have been produced by a slight shift in the
absolute threshold, which involves a very large change in the
just noticeable increment in frequency.

It is known that the absolute threshold is less well defined
than the masked threshold. For this reason the second group of
papers looks more promising for tracing frequency
discrimination down to the threshold. Extensive measurements
are available from Harris (1966), who used white noise at a
level that would just mask a 230 ms sinusoid of 43 dB SL.
Given this noise, the level of the sinusoids was expressed
in terms of a "recognition differential", which is a dB scale
with the zero dB point corresponding to that level of the
sinusoid which allows the subject to score 50% correct in a
three-alternative forced-choice detection experiment (3-AFC).
Frequency discrimination was determined by Harris with a
2-AFC.*) With some reduction and extrapolation the results of
Harris may be interpreted by stating that 'at' the threshold
of masking a 1000 Hz sinusoid of 230 ms duration is perceived
with a D.L. of about 20 Hz.

*) The best documented method for comparing "thresholds"
   obtained with different experimental methods is the theory
   of signal detection. According to graphs and tables by
   Elliott (1964), a 50% correct score (Pq = 0.50) in a
   3-alternative forced-choice experiment is equivalent to
   a detection index d' = 0.6, which in turn corresponds
   to Pq = 0.63 in a 2-AFC under the condition that the
   subjects have no serious bias. Frequency discrimination
   was determined by Harris with a 2-AFC and the D.L. was
   defined at Pq = 0.73 at a sound level which can be
   expressed as 3 dB above the Pq = 0.63 in a 2-AFC. If Harris
   had used Pq = 0.73 and a 2-AFC also in the detection
   experiments, the detection 'threshold' would have come out
   higher, and consequently the level would not have been
   3 dB, but less. How much less can be borrowed from the
   slope of a very well-documented detection curve published
   by Mulligan and Cornelius (1972). This curve is valid
   for 2-AFC detection of 173 ms sinusoids of 1200 Hz
   embedded in white noise. From that curve one reads a
   difference of about 2½ dB between Pq = 0.63 and Pq = 0.73.
   Subtracting this from 3 dB leads to the result in the text.

Another paper dealing with the discrimination of the
frequency of sinusoids in white noise is by Henning (1967).
Since he presents no data on the detection performance of
his subjects under the same condition, one cannot derive a
D.L. for frequency 'at' the masked threshold from his paper.
However, making reasonable assumptions, Henning concludes
that there is no serious discrepancy between his results and
the results of Harris quoted above.

Experiment

In our experiment we set out to determine frequency
discrimination and detection of sinusoids with one and the same
paradigm. We used a slight variation of the conventional
method of determining the D.L. for frequency, keeping the
frequency difference constant during one session and varying
the signal-to-noise ratio until a prescribed performance
Pq could be estimated with sufficient confidence. There
are two reasons for doing this.

In the first place, the difference limen for frequency, as
we have seen, tends to rise sharply when the signal level
is lowered towards the masking level. Therefore, proceeding
along a line of constant signal-to-noise ratio, one has an
ill-defined intersection point, whereas along lines of
constant frequency difference one has a very narrow interval
of S/N which corresponds to the required threshold
performance of the subject.

In the second place, the method allows one to proceed in
exactly the same way when determining the difference limen
for frequency and when determining the threshold for
detection.

The stimuli were embedded in a continuous white noise
in the frequency band 125 Hz to 4000 Hz. The power of the
noise was kept constant throughout the experiments. The
level may be indicated by stating that the noise would just
mask a 1 kHz sinusoid of long duration with a level of 30 dB
SL. A four-alternative choice method was used both for
detection of sinusoids of 1000 Hz and for discrimination of
the frequency of such sinusoids. The observation intervals
were not indicated by lights but in an acoustical way, by
presenting two 'precursor' tone pulses in advance of the
four observation intervals from which the subject had to
choose. So there were 6 noise-plus-tone intervals, separated
by 5 intervals during which only noise was audible. These
5 intervals were 700 ms throughout the experiment. The 2
precursors always had the standard frequency, viz. 1000 Hz.
They always had the same duration as the actual stimuli, but
in all conditions they were exactly 10 dB louder than the
actual stimuli. These precursors proved to be very effective
cues to the subject, helping him remember what the standard
frequency was and what the tone sounded like in the special
envelope used (which will be explained later), and helping him
to concentrate at the right instants of time. Indeed, it
turned out that the subjects easily picked up the rhythm of
the total stimulus pattern from the two precursors and were
synchronized far better than might have been possible with
lights, even to tone intervals as short as 16 ms.
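
The timing of one trial as described above can be sketched as
follows. This is only an illustration: the function name
`trial_timeline` is hypothetical, and it is assumed here that
the 700 ms noise-only intervals run from the end of one tone
pulse to the start of the next. The deletion of one pulse
(detection runs) or its frequency increment (discrimination
runs) is not modelled.

```python
def trial_timeline(tone_ms, gap_ms=700.0, n_precursors=2,
                   n_observation=4, precursor_gain_db=10.0):
    """Return (onset_ms, duration_ms, kind, level_re_stimulus_db) per interval."""
    events = []
    onset_ms = 0.0
    for index in range(n_precursors + n_observation):
        is_precursor = index < n_precursors
        kind = "precursor" if is_precursor else "observation"
        # Precursors are 10 dB above the level of the actual stimuli.
        level_db = precursor_gain_db if is_precursor else 0.0
        events.append((onset_ms, tone_ms, kind, level_db))
        # Assumed: the noise-only gap is measured from offset to next onset.
        onset_ms += tone_ms + gap_ms
    return events


events = trial_timeline(tone_ms=64.0)
total_ms = events[-1][0] + events[-1][1]
for event in events:
    print(event)
print("total duration in ms:", total_ms)
```

With 64 ms pulses this gives 6 tone intervals and 5 gaps, so
the whole pattern lasts 6 x 64 + 5 x 700 = 3884 ms under the
stated assumption.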

The tone pulses were gated in a particular way with a
pulse-width modulator. The envelope was an asymmetrical
triangle. The rise time was always 2 ms, with a rounded foot
and roof. The decay was virtually linear and lasted either
14 ms, 62 ms or 254 ms. The tone pulses looked and sounded like
the response of a less or more sharply tuned filter excited by
a single, sharp pulse. A general comparison of the effect of
various envelopes on the discrimination of the frequencies of
sinusoids has been given by Ronken (1971). His findings seem to
indicate that it is possible to find, for any envelope, an
effective duration by which frequency discrimination is made
invariant to the shape of the envelope. However, in our
experiment no such comparison between various envelopes will
be made. We therefore define the duration Δt of the tone
pulses more or less arbitrarily as the time during which the
gate is not closed. That is, the durations of the tone pulses
were 16 ms, 64 ms or 256 ms.
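
A minimal sketch of such a tone pulse is given below. It is
not the authors' pulse-width modulator: the raised-cosine rise
is an assumed stand-in for the "rounded foot and roof", the
sample rate of 48 kHz is arbitrary, and the names
`pulse_envelope` and `tone_pulse` are illustrative.

```python
import math


def pulse_envelope(duration_ms, rise_ms=2.0, sample_rate_hz=48000):
    """Asymmetrical triangular envelope: short rise, long linear decay."""
    n_total = int(round(duration_ms * sample_rate_hz / 1000.0))
    n_rise = int(round(rise_ms * sample_rate_hz / 1000.0))
    n_decay = n_total - n_rise
    envelope = []
    for n in range(n_total):
        if n < n_rise:
            # Raised-cosine rise: an assumed stand-in for the rounded foot and roof.
            envelope.append(0.5 * (1.0 - math.cos(math.pi * n / n_rise)))
        else:
            # Linear decay from 1 towards 0 over the remainder of the gate time.
            envelope.append(1.0 - (n - n_rise) / n_decay)
    return envelope


def tone_pulse(duration_ms, frequency_hz=1000.0, sample_rate_hz=48000):
    """Sinusoid of `frequency_hz` multiplied by the envelope above."""
    envelope = pulse_envelope(duration_ms, sample_rate_hz=sample_rate_hz)
    return [
        e * math.sin(2.0 * math.pi * frequency_hz * n / sample_rate_hz)
        for n, e in enumerate(envelope)
    ]


for duration_ms in (16.0, 64.0, 256.0):
    pulse = tone_pulse(duration_ms)
    print(duration_ms, "ms ->", len(pulse), "samples")
```

The duration passed in is the full gate time Δt, so the decay
lasts Δt minus the 2 ms rise, i.e. 14, 62 or 254 ms.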

In the detection runs, one of the four stimulus tone
pulses was deleted and the subject had to indicate which one.
In the frequency discrimination runs, the subject had to
indicate which one of the four stimulus tone pulses had been
given a certain frequency increment Δf above the standard
frequency f = 1000 Hz. Δf was known to the subject and was
kept constant during an experimental session.

Three subjects with different amounts of experience in
this type of experiment participated. Fig. 1 a, b, c depicts
data from one of them. The fraction of correct responses Pq is
plotted as a function of the "signal-above-noise level" D.
The D scale has been gauged so as to have D = 0 correspond
with Pq = 0.625 for each subject individually, and for each
duration of the tone pulses. Although there were some bias
effects, the corresponding value of the detection index is
almost exactly d' = 1.19, which is equivalent to 80% correct
responses in a 2-AFC (cf. Elliott, 1964).

[Figure 1, panels a, b, c. Abscissa: signal-above-noise level
D, 0 to 20.]

Fig. 1 The fraction of correct responses Pq in a
       4-alternative choice experiment as a function
       of the "signal-above-noise level" D.
       D is simply "the number of dB above" the
       detection threshold defined by Pq = 0.625.
       Straight lines are drawn to roughly indicate
       the psychometric curves. Every point
       is based on 50 responses, except points
       with Pq = 0.80, where runs were usually
       truncated after 25 responses.
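
The correspondence between the detection index d' and the
fraction correct in an m-alternative forced-choice task, which
the text takes from Elliott's tables, can be approximated
numerically. The sketch below assumes the standard
equal-variance Gaussian model with an unbiased observer; the
function names are illustrative, and the values it produces may
differ slightly from those read from the published tables.

```python
import math


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def proportion_correct(d_prime, n_alternatives, step=0.001, limit=8.0):
    """P(correct) in an m-AFC task: the signal interval must exceed
    all (m - 1) noise intervals. Integrated with the trapezoidal rule."""
    n_steps = int(round(2.0 * limit / step))
    total = 0.0
    for i in range(n_steps + 1):
        x = -limit + i * step
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += (weight * normal_pdf(x - d_prime)
                  * normal_cdf(x) ** (n_alternatives - 1))
    return total * step


print("4-AFC, d' = 1.19:", round(proportion_correct(1.19, 4), 3))
print("2-AFC, d' = 1.19:", round(proportion_correct(1.19, 2), 3))
print("3-AFC, d' = 0.6: ", round(proportion_correct(0.6, 3), 3))
```

For two alternatives the integral reduces to the closed form
Φ(d'/√2), which gives a quick check on the numerical result.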

Each of the three parts of Fig. 1 shows a family of
psychometric 'curves' that have been approximated by straight
lines. The leftmost one refers to detection; the other lines
refer to the frequency increments Δf as indicated. The lines
for the larger frequency increments have not been drawn in
the figure, in order to avoid congestion near the detection
line. The slope of the discrimination lines clearly depends
on the frequency increment Δf employed: flat slopes go with
small Δf and vice versa. The larger frequency increments
Δf correspond to a slope that is equal to the slope of the
detection lines.

When comparing Figs. 1a, 1b and 1c, one sees that 1a is
more similar to 1b than either of these is to 1c. If one
were to divide the Δf values in 1c by a factor of 4, the
three figures would be more or less the same. A similar trend
is present in the other subjects.
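
Reading a threshold level from a straight-line approximation
of a psychometric curve can be sketched as follows. The lines
in Fig. 1 were drawn to roughly indicate the curves; the
least-squares fit used here is a substitute for that, and the
data points are invented for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def threshold_level(levels_db, fractions_correct, criterion=0.625):
    """Level at which the fitted line reaches the criterion performance."""
    slope, intercept = fit_line(levels_db, fractions_correct)
    return (criterion - intercept) / slope


# Invented illustration data, not measurements from the paper.
levels_db = [2.0, 4.0, 6.0, 8.0]
fractions_correct = [0.40, 0.55, 0.70, 0.85]

level_at_criterion = threshold_level(levels_db, fractions_correct)
print(level_at_criterion)
```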

The performance Pq = 0.625 in the frequency discrimination
task defines a certain signal-above-noise level for every Δf.
This level will be denoted by D'.*) Fig. 2 shows D' as a
function of Δf for the three subjects and for each of the
three durations. In Fig. 2 straight lines were fitted by
eye. These lines are rough approximations for each of the
three durations 256 ms (diamonds), 64 ms (triangles) and
16 ms (circles). Again the 256 ms and 64 ms lines are much more
similar to each other than either of them is to the 16 ms line.

*) In this paper the convention is adopted to add a prime (')
   to the symbol of a stimulus parameter to indicate the
   threshold value of that parameter, in the sense that it is
   the value which allows the subject to score Pq = 0.625 in
   a 4-alternative choice method. (This does not apply, of
   course, to the well-known d', the primeness of which is not
   disputed.) A double prime ('') indicates a threshold value
   under the condition that another, independent, stimulus
   parameter is also at threshold.

[Figure 2: three panels, one per subject (BLC, HvL, BvdB).
Ordinate: signal-above-noise level D' in dB, -5 to 15.
Abscissa: frequency difference Δf in Hz, 5 to 500.
Symbols: Δt = 16 ms, 64 ms and 256 ms.]

Fig. 2 Signal-above-noise level D' which allows a subject
       to perform at Pq = 0.625 in a 4-alternative
       choice experiment with a frequency difference
       Δf as indicated along the abscissa. For large
       Δf, discrimination and detection clearly require
       virtually the same signal-above-noise level. In
       each panel three sloping lines, corresponding to
       durations Δt = 16 ms, 64 ms and 256 ms, have
       been drawn on the assumption that they should be
       parallel. The lines intersect D' = 0 in points
       with the frequency difference Δf'' that is just
       noticeable at the masked threshold.

Of course the straight sloping lines have validity only in
a limited range of Δf. In any case, D' must have a vertical
asymptote at Δf equal to the D.L. for very good
signal-to-noise ratios (cf. Henning, 1967). It is clear that
this asymptote is very difficult to determine with the present
method, in which Δf is kept constant and the signal level
is varied in the experiments. The other asymptote is the line
D' = 0. The interesting points in Fig. 2 are the intersection
points of the drawn sloping lines for D' as a function of
Δf with the horizontal line D' = 0. The frequency increment
corresponding to this point will be called Δf''. It is the
frequency difference which can just be discriminated with the
same performance as the tones can be detected. It is of the
order of 16 Hz for tones of 64 ms. For tone pulses of 16 ms
duration, Δf'' = 64 Hz approximately. It is obvious that, in
stating these results, no claims are made about their
accuracy. In fact, Δf'' is not measured directly but derived
from interpolations. But it is claimed that these values have
the correct order of magnitude.
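
The derivation of Δf'' from the sloping lines can be sketched
as a straight-line fit of D' against the logarithm of Δf,
followed by solving for D' = 0. In the paper the lines were
fitted by eye; the least-squares fit below is a substitute,
and the data points are invented and chosen only so that the
line crosses D' = 0 at 16 Hz.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def delta_f_at_threshold(delta_f_hz, signal_above_noise_db):
    """Frequency difference at which the fitted line reaches D' = 0."""
    log_f = [math.log10(f) for f in delta_f_hz]
    slope, intercept = fit_line(log_f, signal_above_noise_db)
    return 10.0 ** (-intercept / slope)


# Invented illustration data, not values read from Fig. 2.
delta_f_hz = [2.0, 4.0, 8.0]
signal_above_noise_db = [9.0, 6.0, 3.0]

print(delta_f_at_threshold(delta_f_hz, signal_above_noise_db))
```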

Comments 

1. A comment should be made on the negative D' values in
   Fig. 2. Of course there is some spread in the measured
   values, amounting to 1 or 2 dB, but it is evident that
   there is a tendency for D' to be negative for large Δf,
   except perhaps with tone pulses of 16 ms duration. A
   possible explanation for this phenomenon is that in the
   corresponding situations the subjects could make use of two
   cues: either they heard a manifestly high pitch in the
   particular stimulus interval, or they heard the absence of
   the tone pulse of 1000 Hz. If it is accepted that the
   subjects could sometimes have 'a second look' at the
   stimulus interval, then it seems plausible that this double
   cue has slightly enhanced performance. A discrimination
   performance which was slightly better than the detection
   performance may be accounted for in this way, especially
   with the long tone pulses.

2. Δf'' = about 16 Hz for long tones (Δt = 0.256 s and Δt =
   0.064 s) corresponds well with a value of the just
   noticeable frequency difference of 20 Hz derived from data
   of Harris (1966). It can be stated that the value of 16 Hz
   is not in conflict with results of Henning (1967). The
   value Δf'' = about 64 Hz for Δt = 0.016 s does not allow
   comparison with data in the literature known to the
   author.

3. A pleasant property of the quantity Δf'' is that it is,
   to a certain extent, independent of the particular
   performance level chosen. This is so because the detection
   lines and the relevant discrimination lines in Fig. 1 are
   parallel. The essential choice is that the detection
   threshold and the discrimination threshold should be defined
   at the same performance level.

Discussion 

It is interesting to discuss the postulate used by Zwicker
in some detail. The postulate states that a frequency
change is perceived whenever, at a point along the critical
band scale, there is a change ΔL_E in the excitation level
L_E which is greater than a certain threshold ΔL'_E. The
postulate boils down to

     Δf' = ΔL'/C                                        (1)

with Δf' the just noticeable frequency difference at a
non-specified level, ΔL' the just noticeable difference in
loudness corresponding to ΔL'_E, and C the steeper slope of
the excitation curve for pure tones, C = 27 dB/critical band.
It is known that C is independent of the sound level. At
f = 1000 Hz, C = 27/160 dB/Hz.

When applying (1) formally to the masked threshold situation
we obtain

     ΔL'' = about 2.7 dB,   Δt = 64 or 256 ms            (2)

which seems to be a plausible value that comes close to a
result of Sherrick (1958), who found that the D.L. for the
loudness of a noise-embedded sinusoid was about 2.3 dB at
S/N = -15 dB. The formal substitution of Δf'' = 16 Hz into
(1) presents, therefore, no particular problems.
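
The arithmetic behind (1) and (2) can be written out as a
short sketch. The constants are those given in the text for
1000 Hz; the function names are illustrative.

```python
SLOPE_DB_PER_CRITICAL_BAND = 27.0        # C, steeper slope of the excitation curve
CRITICAL_BANDWIDTH_HZ_AT_1KHZ = 160.0    # so that C = 27/160 dB/Hz at 1000 Hz

C_DB_PER_HZ = SLOPE_DB_PER_CRITICAL_BAND / CRITICAL_BANDWIDTH_HZ_AT_1KHZ


def level_change_db(delta_f_hz):
    """Equation (1) solved for the level change: ΔL = C * Δf."""
    return C_DB_PER_HZ * delta_f_hz


def frequency_change_hz(delta_l_db):
    """Equation (1) as stated: Δf = ΔL / C."""
    return delta_l_db / C_DB_PER_HZ


print(level_change_db(16.0))      # substitution of Δf'' = 16 Hz, about 2.7 dB
print(frequency_change_hz(1.0))   # frequency change equivalent to 1 dB
```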

Conceptually the situation may be more interesting if we
ask ourselves to what physical bandwidth the postulate should
be applied. Quite generally, the narrower this width, the
longer the time needed for smoothing out the fluctuations in
the excitation due to the noise.

In point of fact our experiment shows that for tone pulses
of Δt = 64 ms and longer, a frequency shift is detected
equally well as the absence of the tone pulse. It is tempting
then to make the assumption that the narrowest width
along the critical band scale which the hearing system can
use for determining changes in excitation is equivalent to
16 Hz. We know moreover from (2) that in this band the
addition of the sinusoid amounts to a doubling, roughly, of the
total excitation. That is, in this band, which is about one
tenth of the critical band, the excitation due to the sinusoid
is about equal to that of the noise. Carrying this reasoning
one step further, the result for the tone pulses of 16 ms
duration may be interpreted in a similar way. Now the hearing
system needs integration over a band of frequencies which is
four times as wide as with the longer tone pulses in order to
smooth the fluctuations in excitation due to the noise.
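
The "rough doubling" can be checked by converting the level
change of (2) into a power ratio. The sketch assumes that the
2.7 dB may be read as the ratio of tone-plus-noise power to
noise power within the 16 Hz band; this reading is an
interpretation for illustration, not a statement from the
paper.

```python
def db_to_power_ratio(level_db):
    """Convert a level difference in dB into a power ratio."""
    return 10.0 ** (level_db / 10.0)


total_over_noise = db_to_power_ratio(2.7)   # (tone + noise) / noise in the band
tone_over_noise = total_over_noise - 1.0    # tone / noise, assuming powers add
band_fraction = 16.0 / 160.0                # 16 Hz relative to the critical band
wide_band_hz = 4 * 16.0                     # band needed for the 16 ms pulses

print(round(total_over_noise, 2), round(tone_over_noise, 2))
print(band_fraction, wide_band_hz)
```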

In a model of pitch perception that is based on the
detection of periodicity, the present results can be
accommodated easily, albeit qualitatively. In this model the
outputs of the cochlear filter are processed by an array of
pitch extractors. The additional assumption is made that the
pitch extractors have a memory of 64 ms duration. One pitch
extractor then has a bandwidth of about 16 Hz, so that pitch
perception in the sense of detecting a sinusoid in noise and
detecting a frequency shift of 16 Hz amount to the same
thing. When the signals are shorter than 64 ms, the bandwidth
widens proportionately.

Conclusion 

Summing up, it has proved possible to determine auditory
frequency discrimination with sinusoids "at the masked
threshold", whatever this threshold may be. The just
noticeable frequency difference at this singular level,
denoted by Δf'', proves to be one order of magnitude narrower
than the critical band. This result was found with "long"
sinusoids of f = 1000 Hz. For 1000 Hz sinusoids with a duration
of only 16 ms another value of Δf'' was derived. This value
fits the relation Δf''·Δt = 1, for Δt smaller than about 64 ms.
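
The relation Δf''·Δt = 1 can be turned into a small sketch.
Holding Δf'' constant for durations beyond 64 ms is an
assumption made here; it combines the relation above with the
64 ms memory assumed for the pitch extractors in the
Discussion. The function name is illustrative.

```python
MEMORY_S = 0.064  # assumed memory of a pitch extractor, from the Discussion


def threshold_frequency_difference_hz(duration_s, memory_s=MEMORY_S):
    """Δf'' from Δf'' * Δt = 1, with Δt limited to the assumed memory."""
    effective_duration_s = min(duration_s, memory_s)
    return 1.0 / effective_duration_s


for duration_s in (0.016, 0.064, 0.256):
    print(duration_s, threshold_frequency_difference_hz(duration_s))
```

The results, 62.5 Hz for 16 ms and about 15.6 Hz for 64 ms and
longer, agree in order of magnitude with the values of 64 Hz
and 16 Hz derived from Fig. 2.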

References 

Cardozo, B.L. (1967) "Ohm's Law and Masking". IPO Ann. Progr.
   Rep. 2, 59-65.

Elliott, P.B. (1964) "Tables of d'". In: Signal Detection and
   Recognition by Human Observers, edited by J.A. Swets.

Greenberg, G.Z. and Larkin, W.D. (1968) "Frequency Response
   Characteristic of Auditory Observers Detecting Signals of
   a Single Frequency in Noise". J. Acoust. Soc. Am. 44,
   1513-1523.

Greenwood, D.D. (1961) "Auditory Masking and the Critical
   Band". J. Acoust. Soc. Am., 484-502.

Harris, J.D. (1952) "Pitch Discrimination". J. Acoust. Soc.
   Am., 730-733.

Harris, J.D. (1966) "Masked D.L. for Pitch Memory". J. Acoust.
   Soc. Am., 43-46.

Helmholtz, H. von (1913) "Die Lehre von den Tonempfindungen",
   6th ed. Vieweg, Braunschweig, 84-112.

Henning, G.B. (1967) "Frequency Discrimination in Noise".
   J. Acoust. Soc. Am., 774-777.

Maiwald, D. (1967) "Ein Funktionsschema des Gehörs zur
   Beschreibung der Erkennbarkeit kleiner Frequenz- und
   Amplitudenänderungen". Acustica 18, 81-92.

Mulligan, B.E. and Cornelius, P.T. (1972) "Note on
   Psychometric Invariance of Detection Functions". J. Acoust.
   Soc. Am., 1207-1208.

Plomp, R. (1964) "The Ear as a Frequency Analyser". J. Acoust.
   Soc. Am. 36, 1628-1636.

Pollack, I. (1948) "The Atonal Interval". J. Acoust. Soc. Am.
   20, 146-149.

Rakowski, A. (1971) "Pitch Discrimination at the Threshold of
   Hearing". Proc. 7th Int. Congress on Acoustics, Budapest,
   Vol. 3, Paper 20 H 6, 373-376.

Ronken, D.A. (1971) "Some Effects of Bandwidth-Duration
   Constraints on Frequency Discrimination". J. Acoust. Soc.
   Am., 1232-1242.

Schouten, J.F. (1940) "The Residue, a New Component in
   Subjective Sound Analysis". Proc. Koninklijke Nederlandse
   Akademie van Wetenschappen XLIII, 337-339.

Sherrick, C.E. (1958) "Effect of Background Noise on the
   Auditory Intensive D.L.". J. Acoust. Soc. Am. 31, 239-242.

Shower, E.G. and Biddulph, R. (1931) "Differential Pitch
   Sensitivity of the Ear". J. Acoust. Soc. Am., 273-287.

Zwicker, E. (1956) "Die elementaren Grundlagen zur Bestimmung
   der Informationskapazität des Gehörs". Acustica, 336-381.

Zwicker, E. (1970) "Masking and Psychological Excitation as
   Consequences of the Ear's Frequency Analysis". In: Plomp
   and Smoorenburg, Editors: Frequency Analysis and
   Periodicity Detection in Hearing, Sijthoff, Leiden, 376-394.







MONOTIC AND DICHOTIC PITCH MATCHINGS WITH COMPLEX SOUNDS
G. VAN DEN BRINK

Dept. of Biological and Medical Physics
Erasmus University Rotterdam, The Netherlands

1) Introduction

In previous publications (1969, 1971, 1972) diplacusis 
and pitch experiments with monotic presentation of pure 
tones and three-component complex sounds have been described. 

Briefly summarizing the results, it has been found that: 

a) Binaural diplacusis, due to slight differences of the
   frequency-pitch relations for pure tones in a person's two
   ears, exists for everybody. Binaural matching with
   alternating tone pulses as a function of the frequency
   results in irregular patterns, with maxima and minima up to
   a few percent for "normal" subjects.

b) Auditory threshold curves as well as isophones show a fine
   structure which is correlated with diplacusis, suggesting
   a common cause of both phenomena. It is not yet known what
   the cause of these phenomena is; one can imagine that they
   are due to mechanical irregularities in the organ of Corti,
   for example spatial irregularities in the mechanical
   properties of the basilar membrane on top of a smooth
   increase of width and mass from base to apex (1969).

c) Pure tone diplacusis is predictive of diplacusis with
   complex (residue) sounds. This has been found by using
   harmonic amplitude modulated signals with relatively low
   ratios between carrier and modulation frequencies (not
   beyond 8). There is a convincing agreement between
   diplacusis for AM signals and the average pure tone
   diplacusis for the carrier and side bands, measured as a
   function of the frequency (1971).

d) Also in the case of auditory fatigue, pure tone diplacusis 
for the spectral components is likewise predictive for 
diplacusis for complex sounds (1972). 

Recently, these findings have been confirmed by two other
subjects. From the results it has been concluded that the
residue pitch of a complex sound is determined by the combined
neural activity as it is elicited by its spectral components
separately.

It seems worthwhile to mention one additional experiment,
the results of which are in full agreement with the fatigue
experiment mentioned under d. Two subjects measured the pitch
shift of a three-component residue sound and of its spectral
components separately, due to the presence of low-pass noise,
as a function of its cut-off frequency (see also Terhardt
and Fastl, 1971; and Terhardt, 1972). Under these
circumstances too, the pure tone data turned out to be
predictive of what happens with the pitch of a residue sound.
Publication of these data, as well as other data that will be
briefly described in this paper, is in preparation.

2) Residue pitch for dichotically presented harmonic complex
   sounds

With our experimental set-up, consisting of clock generators,
dividing systems and phase-lock generators, which will
be described in a future publication, it was possible to
present two harmonic three-component signals (signals A and B)
alternately. The components of either signal could be
distributed as desired over a subject's two ears. At the right
of Fig. 1 three examples of possible stimulus configurations
are given. For the one indicated with I, signal A consists of
the 4th, 5th and 6th harmonics of a common fundamental
presented to the left ear; signal B also consists of the 4th,
5th and 6th harmonics of a common fundamental. The 5th and the
6th harmonics are presented to the left ear too, but the 4th
to the right ear. The other two configurations speak for
themselves.
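
Configuration I can be written down as a small data structure.
The sketch is illustrative only: the function name
`dichotic_configuration` and the fundamental of 200 Hz are
assumptions, not values from the paper.

```python
def dichotic_configuration(fundamental_hz, harmonics=(4, 5, 6), right_ear=()):
    """Distribute the harmonics of one complex sound over the two ears."""
    left = [k * fundamental_hz for k in harmonics if k not in right_ear]
    right = [k * fundamental_hz for k in harmonics if k in right_ear]
    return {"left": left, "right": right}


f0_hz = 200.0  # hypothetical fundamental

signal_a = dichotic_configuration(f0_hz)                   # all components left
signal_b = dichotic_configuration(f0_hz, right_ear=(4,))   # 4th harmonic right

print(signal_a)
print(signal_b)
```

The other two configurations follow by passing `right_ear=(5,)`
or `right_ear=(6,)` for signal B.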

The subject's task was to adjust the (clock) frequency F_B of
signal B such that the pitch of signal B was equal to that of
signal A, as a function of the (clock) frequency F_A of signal
A. Because of the properties of the dividing system,
(F_B - F_A)/F_A equals (f_Bk - f_Ak)/f_Ak, k being the rank
number of a harmonic.



Before discussing the results, it might be illustrative 
to describe what actually is being heard under these circum- 
stances. Having a configuration as described, signal A, with 
all three components in the left ear, results in a clear 
residue pitch sensation which is being perceived inside the 
head, near the left ear. Signal B, having its lowest compon- 
ent in the right ear, results in a nearly equally pronounced 
residue sensation near the left ear, and simultaneously in 
a pure tone sensation near the right ear. This pure tone has 
a pitch that corresponds with the frequency of the component 
that is presented to the right ear. The timbre of the sound 
that is heard in the left ear is much more like that of a 
three-component signal than that of a two-component signal. 
This indicates already that the component that is presented 
separately in the right ear does contribute to the residue
sensation near the left ear. A stronger argument that
such an interaction takes place is that the same thing hap- 
pens when the middle component is presented to the right ear. 
We then have the 4th and the 6th harmonic of a common fundam- 
ental, being the 2nd and the 3rd harmonic of a common fundam- 
ental which is an octave higher, in the left ear. Yet, the 
residue pitch perceived in this case corresponds to the pitch 
of a pure tone with a frequency which is equal to that of the 
common fundamental of the 4th, 5th and 6th harmonic. Switching 
off the 5th harmonic in the right ear or tilting the right 
earphone results in an octave jump of the sound perceived near 
the left ear. Since this phenomenon occurs also at very low
intensity levels, it cannot be due to cross talk between the 
ears, so that it must be concluded that interaction of neural 
activity descending from both ears takes place. 
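
The octave relation described above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the fundamental of 200 Hz and the helper name `common_fundamental` are assumptions, not values or terms from the experiment. It shows that the 4th and 6th harmonics alone share a fundamental one octave above that of the complete 4th, 5th and 6th harmonic complex.

```python
from functools import reduce
from math import gcd


def common_fundamental(components_hz):
    """Largest frequency (Hz) of which every component is an integer multiple."""
    return reduce(gcd, components_hz)


f0 = 200  # hypothetical fundamental in Hz, chosen for illustration only

full_complex = [4 * f0, 5 * f0, 6 * f0]  # 4th, 5th and 6th harmonic
left_ear_only = [4 * f0, 6 * f0]         # 5th harmonic moved to the right ear

print(common_fundamental(full_complex))   # fundamental of the complete complex
print(common_fundamental(left_ear_only))  # one octave higher
```

That the perceived residue nevertheless corresponds to the lower fundamental is the observation the text uses as evidence for binaural interaction.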

It turned out not to be difficult to concentrate one's 
attention to the sound that is being perceived at the left 
only, disregarding the pure tone sensation at the right. 

For each of the three cases, with the 4th, the 5th or the
6th harmonic of signal B presented to the right ear, residue
pitch matchings have been made as a function of the frequency.
In Fig. 1 the results for one subject have been plotted as a
function of the frequency of the component of signal A that
has the same rank number as that of signal B which is presented
to the right ear (top three curves in Fig. 1, $f_{A4}$, $f_{A5}$ and
$f_{A6}$, respectively).

Fig. 1. Residue pitch matchings of monotically and dichotically
presented complex sounds with signal B in its entirety.

All three curves can be interpreted as being the result 
of pure tone binaural diplacusis measurements, except that two 
more harmonics were presented simultaneously in the left ear, 
so that a residue pitch instead of a pure tone pitch is being 
perceived. Because earlier observations showed that for rank 
numbers, as used here, diplacusis of a residue equals approxi- 
mately the average pure tone diplacusis for the components 
separately, the curves I, II and III represent equally weighted
diplacusis curves. If no other factors are involved, which 
seems not to be the case for these relatively low rank numbers, 
the sum of the weights has to be one. This means that summing 
of the curves I, II and III has to result in a curve that is 
similar to a pure tone diplacusis curve. The fact that this,
indeed, turns out to be the case (compare curve IV with curve
V) indicates that a spectral component contributes with equal
weight to a residue pitch, regardless of the ear to which it
is presented.
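
The equal-weight argument can be made concrete with a small numerical sketch. Everything in it is assumed for illustration: the shape of `pure_tone_diplacusis`, the test frequencies and the function names are hypothetical, and only the weights of one third per component follow the reasoning in the text. If each of the three components contributes one third of its own pure tone diplacusis when it is moved to the other ear, the three curves add up to the pure tone diplacusis curve.

```python
def pure_tone_diplacusis(freq_hz):
    """Hypothetical pure tone diplacusis (percent) as a function of frequency.

    The shape is invented for illustration; it is not measured data.
    """
    return 1.5 * (freq_hz / 1000.0 - 1.0) ** 2 - 0.5


# Equal weights for the 4th, 5th and 6th harmonic; they sum to one.
WEIGHTS = {4: 1.0 / 3.0, 5: 1.0 / 3.0, 6: 1.0 / 3.0}


def residue_shift(rank, freq_hz):
    """Residue pitch shift (percent) when the component of this rank,
    located at freq_hz, is presented to the opposite ear."""
    return WEIGHTS[rank] * pure_tone_diplacusis(freq_hz)


freqs_hz = [600, 800, 1000, 1200, 1400]

# One curve per transferred component, each plotted against the
# frequency of that component (as for curves I, II and III).
curves = {rank: [residue_shift(rank, f) for f in freqs_hz] for rank in WEIGHTS}

summed = [sum(curves[rank][i] for rank in WEIGHTS) for i in range(len(freqs_hz))]
reference = [pure_tone_diplacusis(f) for f in freqs_hz]

for f, s, r in zip(freqs_hz, summed, reference):
    print(f, round(s, 6), round(r, 6))
```

With unequal weights the sum would still reproduce the pure tone curve as long as the weights add up to one, but the individual curves would no longer coincide.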

It can be concluded, therefore, that our earlier conclusion 
that residue pitch is determined by the combined neural activ- 
ity as elicited by the spectral components separately, also 
holds in the case of dichotic stimulation. 

3) Dichotic residue pitch for nonharmonic complex sounds.

The fact that a residue pitch of a complex sound can be
perceived with dichotic stimulation as well makes it possible
to stimulate with non-harmonic complexes, without the usual
consequences due to physical interference, causing beats and
roughness or falseness of the sound. As long as that part of
a sound which is presented to one ear remains harmonic, it is,
within limits, allowed to add components to the other ear
which are not strictly harmonic with regard to the component(s)
in the first ear.

This is demonstrated in the next experiment. The stimulus
configuration (see Fig. 2) was again such that signal A
consisted of three harmonics (4th, 5th and 6th) presented in the
left ear. Signal B consisted of three harmonics with the same
rank numbers, the lowest two having exactly the same
frequencies as the lowest two components of signal A. These two
components were also presented to the left ear. The highest
component (6th harmonic) of signal B, however, was unlocked and
was presented to the right ear. The configuration is shown in
the top half of Fig. 2. The frequency of the highest component
of signal B could be varied independently and, thus, could be
non-harmonic. The frequency of this component was adjusted in
such a way that the pitches of the two signals were equal.

If only the spectral components of the external stimulus contri- 
bute to the residue pitch and no other factors are involved, 
the contribution of component A6 to the pitch of signal A must 
be matched with the contribution of component B6 to the pitch 
of signal B in order to have equal pitches for both signals. 

This turns out to be the case for rank numbers 4, 5 and 6 of
the harmonics. Dichotic non-harmonic residue matchings are
equal to pure tone matchings. This indicates again that spectral
components contribute to a residue pitch with equal weights
in the case of dichotic presentation as they do in the case of
monotic presentation, at least for relatively low rank numbers.

Fig. 2. Like Fig. 1, but with only one variable component.

For higher rank numbers (see Fig. 2b), however, the necessary
shift of the component in question, in order to match the
pitches, is systematically higher than for pure tones. Although
it cannot be excluded that this deteriorated fit is due to a
smaller weight of the direct contribution of component B10 to
the pitch of signal B, we are inclined to assume other factors
to be involved. An argument is that also in the case of monotic
experiments the fit between measured binaural diplacusis
for residue sounds and the data calculated from pure tone
diplacusis deteriorates with increasing rank number. Possible
factors might be: a) an increasing overlap of the excitation
patterns because of the increase of the critical band with
regard to the spectral separation of the components;
b) the necessity of choosing other weight combinations when
the components are moving away from the region of spectral
dominance; c) the possibility of an increased contribution
of combination tones, which do have smaller amplitudes, but
which are closer to the region of spectral dominance.

The influence of the rank number of the harmonics is
convincingly demonstrated in Fig. 3. In this experiment the
frequency of the highest component of signal A was kept
constant, whereas the rank numbers of the components increased.
For rank numbers of the highest component beyond 8, the
frequency of the highest component of a dichotically presented
three-component signal has to be shifted more with increasing
rank number, in order to match the pitches. These results
demonstrate that our previously mentioned conclusion is also
valid for signals that are not strictly harmonic, provided
that the rank numbers are sufficiently low and the
non-harmonic part of the stimulus is being presented in the
opposite ear so that no physical interference can take place in
the peripheral hearing organ.

Fig. 3. Residue pitch matching with one variable component
for two frequencies as a function of the rank number
of that component.





4) Dichotic fusion.

Since it is apparently allowed to mix components which mutually
have a certain degree of non-harmonicity, as long as mutually
non-harmonic components are not presented in the same ear,
it seemed of interest to have another look at dichotic fusion.

The upper and lower fusion limits for the frequency of a pure 
tone simultaneously presented in the right ear have been deter- 
mined as a function of the frequency of a pure tone presented 
in the left ear. 




Fig. 4. The width of the frequency range for dichotic pure tone 
fusion as a function of the frequency. 

In Fig. 4 the width of the fusion region is plotted as a 
function of the frequency. The results are in very good agree- 
ment with those of Odenthal (1963). More interesting, however, 
is the fact that, within the limits of measuring accuracy, the 
fusion bands are symmetrical with respect to the situation of
equal pitch and thus to the value of binaural diplacusis in 
the way we measured it. The diagram in Fig. 5 demonstrates 
this for a number of frequencies. Binaural diplacusis as well 
as the upper and lower fusion limits have been determined. 

The centre frequency of the fusion band has been plotted
horizontally and the diplacusis value vertically. The solid line
represents complete correlation. The scatter of the points
must be mainly due to the fusion measurements which are less
accurate than the diplacusis matchings. The correlation,
however, is indisputable. Some residue pitch fusion experiments
were carried out too. As far as can be concluded at the moment,
there is hardly any difference in the width of the fusion band
when compared with pure tone results. Symmetry occurs also
with regard to diplacusis.

Fig. 5. The centre frequency of the dichotic fusion range
plotted against the value of diplacusis (in %) for a number
of frequencies.

The results imply that also in the case of dichotic fusion, 
the neural excitation patterns, elicited by information des- 
cended from both ears and by themselves determining for the 
separate pitches, superimpose. This results in an excitation 
pattern which determines the pitch of the whole, dichotically 
presented stimulus. In other words, dichotic fusion is not a 
matter of frequency fusion, but of pitch fusion. Therefore, 
in the case of binaural stimulation, binaural diplacusis can 
never be noticed as long as it does not exceed the limits of 
pitch fusion. 
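
The comparison underlying Figs. 4 and 5 amounts to simple arithmetic on the measured fusion limits. The following sketch assumes hypothetical numbers (a 1000 Hz tone in the left ear, fusion limits of 985 Hz and 1035 Hz in the right ear, and a pitch match at 1010 Hz); the function name `fusion_band` is likewise an assumption. It expresses the width and the centre of the fusion band in percent of the left-ear frequency and compares the centre with the diplacusis value.

```python
def fusion_band(lower_hz, upper_hz, left_hz):
    """Return (width, centre offset) of the fusion band in percent.

    lower_hz, upper_hz: fusion limits of the right-ear tone.
    left_hz: frequency of the tone in the left ear.
    """
    width = 100.0 * (upper_hz - lower_hz) / left_hz
    centre = 100.0 * ((lower_hz + upper_hz) / 2.0 - left_hz) / left_hz
    return width, centre


def diplacusis_percent(matched_right_hz, left_hz):
    """Binaural diplacusis: relative frequency offset at equal pitch."""
    return 100.0 * (matched_right_hz - left_hz) / left_hz


# Hypothetical measurements for one frequency (not data from the paper).
left = 1000.0
width, centre = fusion_band(lower_hz=985.0, upper_hz=1035.0, left_hz=left)
diplacusis = diplacusis_percent(matched_right_hz=1010.0, left_hz=left)

print(width, centre, diplacusis)
# A symmetrical fusion band means centre and diplacusis coincide.
print(abs(centre - diplacusis) < 0.1)
```

Plotting `centre` against `diplacusis` for a number of frequencies would give a diagram of the kind described for Fig. 5, with points on the diagonal representing complete correlation.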





Conclusions.

Provided that the rank numbers of the spectral components
of a complex sound are chosen sufficiently low (≤ 8), the neural
excitation pattern which determines the residue pitch of the 
sound is the result of superposition of neural excitation pat- 
terns as elicited by the spectral components which by them- 
selves would determine their pure tone pitches in the case 
that they were presented separately. With regard to pitch this 
rule holds also in the case of dichotic stimulation, when part 
of the information descends from one ear and the rest from the 
other ear. As long as physical interference in the peripheral 
organs is avoided, the components do not even have to be strictly 
mutually harmonic in order to have an unambiguous pitch sensation. 
This conclusion is being confirmed by the fact that the frequency 
range, within which dichotic pitch fusion occurs, appears to be 
symmetrical with regard to left-right equal pitch conditions 
(i.e. binaural diplacusis). 

For harmonics with rank numbers not beyond 8, the separate 
excitation patterns superimpose approximately with equal weights. 
For higher rank numbers this simple rule does not hold any more. 
Other factors, such as mutual overlap of the separate excitation 
patterns, spectral dominance, and the existence of combination 
tones must then be considered. 

References 

Van den Brink, G. (1969) Experiments on binaural diplacusis and
      tone perception. In: Frequency analysis and periodicity
      detection in hearing. Ed. R. Plomp and G.F. Smoorenburg,
      A.W. Sijthoff, Leiden, 1970.

Van den Brink, G. (1971) Two experiments on pitch perception.
      J. Acoust. Soc. Am. 48, 1355-1365.

Van den Brink, G. (1972) The influence of fatigue upon the pitch
      of pure tones and complex sounds. Proc. Symposium on
      Hearing Theory, I.P.O., Eindhoven.

Odenthal, D.W. (1963) Perception and neural representation of
      simultaneous dichotic pure tone stimuli. Acta Physiol.
      Pharmacol. Neerl. 12, 453-496.

Terhardt, E. und H. Fastl (1971) Zum Einfluß von Störtönen und
      Störgeräuschen auf die Tonhöhe von Sinustönen. Acustica
      25, 53-61.

Terhardt, E. (1972) Zur Tonhöhenwahrnehmung von Klängen. I.
      Psychoakustische Grundlagen. Acustica 26, 173-186.







COMMENTS ON: "Monotic and Dichotic Pitch Matchings with Complex Sounds" 

(G. VAN DEN BRINK) 

E. TERHARDT 

Institut für Elektroakustik der Technischen Universität, München, FRG

It may be of interest to note that the maximal frequency distances for 
dichotic fusion and monotic fusion of two simultaneous pure tones,
respectively, coincide well (Fig. 4 of van den Brink's paper; Thurlow and 
Bernstein, 1957; Plomp, 1964; Terhardt, 1968). This is another indication 
of the close cooperation of both ears in pitch perception. In addition 
to the relations between virtual pitch of complex tones and spectral 
pitch of their harmonics which have been demonstrated so excellently by 
van den Brink in his dichotic experiments, this issue supports the idea 
that 

(1) spectral resolution of harmonics is an essential condition for the 
existence of virtual pitch (Terhardt, 1970), 

(2) virtual pitch is derived from the spectral pitches (in contrast to 
periodicities or frequencies) of dominant (i.e. aurally resolvable) 
harmonics (Terhardt, 1970; 1972a, b), and 

(3) virtual pitch is the product of a perceptual process of much higher 
complexity (i.e. of a more "central" process) than is the case for 
spectral pitch (Walliser, 1969; Terhardt, 1970; 1972a, b; Houtsma 
and Goldstein, 1972; Wilson, 1974; Goldstein, 1974; Terhardt, 1974). 

The new findings of van den Brink fit well into these principles, as 
formerly did the results of Houtsma and Goldstein. Van den Brink's 
results appear even more cogent since they are concerned with pitch
perception instead of musical interval recognition.

REFERENCES 

Goldstein, J.L. (1974). "An optimum processor theory for the central
formation of the pitch of complex tones," J. Acoust. Soc. Amer. 54,
1496-1516.

Houtsma, A.J.M., and Goldstein, J.L. (1972). "The central origin of the
pitch of complex tones: Evidence from musical interval recognition,"
J. Acoust. Soc. Amer. 520-529.

Plomp, R. (1964). "The ear as a frequency analyzer," J. Acoust. Soc. Amer.
1628-1636.

Terhardt, E. (1968). "Über die durch amplitudenmodulierte Sinustöne
hervorgerufene Hörempfindung," Acustica 20, 210-214.

Terhardt, E. (1970). "Frequency analysis and periodicity detection in the
sensations of roughness and periodicity pitch," in "Frequency
analysis and periodicity detection in hearing," R. Plomp and G.F.
Smoorenburg (Eds.), Sijthoff, Leiden, pp. 278-287.

Terhardt, E. (1972a). "Zur Tonhöhenwahrnehmung von Klängen. I.
Psychoakustische Grundlagen," Acustica 26, 173-186.

Terhardt, E. (1972b). "Zur Tonhöhenwahrnehmung von Klängen. II. Ein
Funktionsschema," Acustica 26, 187-199.

Terhardt, E. (1974). "Pitch, consonance, and harmony," J. Acoust. Soc.
Amer. 55, No. 4 (in press).

Thurlow, W.R., and Bernstein, S. (1957). "Simultaneous two-tone pitch
discrimination," J. Acoust. Soc. Amer. 515.

Walliser, K. (1969). "Über ein Funktionsschema für die Bildung der
Periodentonhöhe aus dem Schallreiz," Kybernetik 6, 65-72.

Wilson, J.P. (1974). "Psychoacoustical and neurophysiological aspects
of auditory pattern recognition," in "The Neurosciences: Third
Study Program," MIT Press, pp. 147-153.




IV. Auditory Time Analysis 







RECORDINGS FROM SPIRAL GANGLION NEURONES
L.U.E. KOHLLÖFFEL

I. Physiologisches Institut der Universität Erlangen-Nürnberg,

Erlangen, FRG

INTRODUCTION: Owing to its location in Rosenthal's canal the spiral
ganglion in the cat can hardly be considered a place suitable for
comprehensive surveys of the cochlear neural output, a circumstance
which may largely explain its past neglect by neurophysiologists.
Although it defies easy access it is this very location which turns into
a most attractive feature when we conceive of the benefits for
intracochlear experimentation afforded by the possibility of direct visual
control over related areas of the spiral ganglion and of the basilar
membrane. It is this aspect which primarily provided the stimulus for
the present study.

In exploring neurone behaviour in the spiral ganglion it is important
to observe as closely as possible standards developed for work on
auditory nerve fibres so that the well established fibre discharge
characteristics (Kiang et al., 1965) can be called upon as a reference
against which to contrast neuronal discharges. It is indeed reassuring
to find that the firing patterns described for fibres also seem to
prevail in recordings from spiral ganglion neurones. On this account
we may be led to acknowledge the feasibility of the approach to the
ganglion if it were not for some units which clearly deviate in some
aspects of their response patterns from the norm set by fibre studies.

It is the purpose of this paper to report these interesting
discrepancies and to draw attention to the relation between unit
characteristics and unit location along the basilar membrane. - A
more complete account of the investigation has been given elsewhere
(Kohllöffel, 1974).

METHODS: Anaesthetized cats (Dial-Urethane, 0.75 mg/kg) were placed in
a double-walled, electrically-shielded chamber. After removal of the
round window membrane a small region of the spiral ganglion in the
basal turn approximately 0.6 to 0.8 mm in length was exposed for the
insertion of indium-filled, platinum-tipped microelectrodes (Dowben and
Rose, 1953) with tip diameters of 4 to 8 µm. Acoustic stimuli were
delivered in the usual manner with a closed system consisting of a half-
inch condenser earphone (Brüel & Kjaer 4134) and a sensing quarter-inch
condenser microphone (Brüel & Kjaer 4136) as monitor for the sound pressure
at the eardrum. The state of preparations was checked with the threshold of
the N1 response to 100 µsec clicks as criterion. A wire electrode placed
in the round window region in close juxtaposition to the microelectrode
proper was used to record this cochlear gross response. In order to cancel
the gross potentials picked up by the low-impedance microelectrode the
output of the wire electrode was subtracted from the microelectrode
output with a differential amplifier. - On and off-line processing of spike
activity was done in the way described by Kiang et al. (1965) and tuning
curves were obtained with an automated procedure (Kiang, Moxon and Levine,
1970).

RESULTS: With the slender rate of 1 to 18 units per preparation
extracellular recordings from 160 units were collected from 28 different cats
with satisfactory response thresholds. Usually holding on to units was
no problem and the observation of some of the units lasted as long as six
hours. Unit activity was found to be of the "all-or-none" type and Fig. 1
displays the two types of spike waveforms encountered. While approximately
65% of the units showed the "positive" spike in Fig. 1A, approximately
35% of the units showed the "negative" spike in Fig. 1B. There appeared
to be little chance to effect a change from one type of spike to the other
for individual units by moving the microelectrode. It may be mentioned
that the "positive" spike resembles closely in shape the ones reported by
Svaetichin (1951) for the spinal ganglion cells in the frog.



Fig. 1 (panels A and B; time calibration 1 msec). ... spike (unit 57-3).
Positive polarity is upwards; bandwidth of recording system is 80 Hz
to 10 kHz.




Fig. 2. Responses from two neurones in the same preparation. Unit 52-16:
"negative" spike waveform and 2.62 mm from basal end of basilar
membrane. Unit 52-18: "positive" spike waveform and 2.35 mm from basal
end of basilar membrane. Stimuli: levels in dB re 200 V p-p into
earphone for tone bursts and tuning curves and re 100 V into earphone
for rarefaction clicks. Tone bursts at 21 kHz, duration 20 msec and
50 msec, repetition rate 10 per sec, 2.5 msec rise and fall time. Zero
time of PST histogram of tone burst responses is 2.5 msec prior to the
onset of the electric stimulus to the earphone. Rarefaction clicks
100 µsec duration, 10 per sec repetition rate. Length of runs: for tone
bursts 1 min, for clicks 8 min (unit 52-16) and 5 min (unit 52-18).
Panels: PST and INT histograms of tone burst responses (21 kHz, -25 dB,
durations 20 msec and 50 msec) for units 52-16 and 52-18; tuning curves
(frequency in kHz); PST histograms of responses to rarefaction clicks.


Fig. 2 exemplifies the patterns of response for two units from the same
preparation. Leaving aside for the moment the interesting differences
between units evident in the tone burst responses we may first discuss the
features which are in line with fibre responses. Thus we can see that the
tuning curves (here uncorrected for the frequency response of the
acoustic system and the middle ear) turned out very similar to those
reported for fibres with high characteristic frequency (CF) (Kiang, Sachs
and Peake, 1967; Kiang, Moxon and Levine, 1970; Kiang and Moxon, 1972).

PST histograms of click responses also followed the pattern known from
high CF fibres (Kiang et al., 1965).

It is interesting to investigate the relation between unit CF and unit
location along the basilar membrane. CF was estimated as the last
recognizable dip of the high frequency part of the tuning curves. Unit
location relative to the basal end of the basilar membrane was estimated
by radially projecting from the locus of the penetration in the ganglion
to the basilar membrane. The three-dimensional graph in Fig. 3 shows the
resulting distribution of units versus estimated CF and estimated
distance from the basal end of the basilar membrane. The dashed line
in the baseplane of the graph closely fits Schuknecht's (1960) data and
it represents Greenwood's function (1961) relating the frequency of
maximum vibration to the locus on the basilar membrane. The shaded
area is the section cut through the histogram according to this line.

With Greenwood's function cutting right across the bin with the
highest unit content it is confirmed that the function can be used
with some measure of success to predict for a given point on the
membrane in the basal turn the CF of neurones in the corresponding
region of the ganglion.
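
The two-dimensional binning behind the histogram of Fig. 3 can be sketched in a few lines. The bin size of 0.5 mm x 5 kHz and the example bin (3.5 to 4 mm, 12.5 to 17.5 kHz) are taken from the figure caption; the origin of the frequency bins at 2.5 kHz is inferred from that example, and the unit list, the constant names and the function `bin_of` are hypothetical.

```python
import math
from collections import Counter

DIST_BIN_MM = 0.5    # bin width along the basilar membrane (from the caption)
CF_BIN_KHZ = 5.0     # bin width along the CF axis (from the caption)
CF_ORIGIN_KHZ = 2.5  # assumed origin, so that one bin spans 12.5 to 17.5 kHz


def bin_of(distance_mm, cf_khz):
    """Return ((d_low, d_high), (cf_low, cf_high)) of the bin holding a unit."""
    i = math.floor(distance_mm / DIST_BIN_MM)
    j = math.floor((cf_khz - CF_ORIGIN_KHZ) / CF_BIN_KHZ)
    distance_bin = (i * DIST_BIN_MM, (i + 1) * DIST_BIN_MM)
    cf_bin = (CF_ORIGIN_KHZ + j * CF_BIN_KHZ, CF_ORIGIN_KHZ + (j + 1) * CF_BIN_KHZ)
    return distance_bin, cf_bin


# Hypothetical (distance from basal end in mm, estimated CF in kHz) pairs.
units = [(2.62, 21.0), (2.35, 21.0), (2.70, 19.0), (3.70, 15.0)]

bin_content = Counter(bin_of(distance, cf) for distance, cf in units)

for (distance_bin, cf_bin), count in sorted(bin_content.items()):
    print(distance_bin, cf_bin, count)
```

The bin with the highest count corresponds to the tallest vertical bar in a plot of this kind; the same binning, with the mean latency per bin instead of a count, would give histograms like those of Fig. 4.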

The latency of click responses may be submitted to similar analysis. 

In Fig. 4 the latency of the first peak in the PST histogram to rare- 
faction clicks is plotted for 23 units versus estimated CF and estimated 
unit distance from the basal end of the basilar membrane. The shaded 
area to the left represents the latency histogram versus CF, the shaded 
area in the background is the latency histogram versus distance. Both 
histograms show an average latency of about 1.2 msec. 




Fig. 3. Distribution of units versus estimated CF and estimated
distance from basal end of basilar membrane. The baseplane is
divided into (two-dimensional) bins with the area of 0.5 mm x 5 kHz.
Bin content is indicated by vertical bars (e.g. one unit in bin 3.5
to 4 mm and 12.5 to 17.5 kHz). Dashed line in the baseplane according
to Greenwood's (1961) function:

f = 418.6

f in Hz, distance x in mm from basal end, 22 mm average length of
cat's basilar membrane.

Shaded area is section cut through histogram according to dashed
line. - Represented are 105 units from 15 cats.

We may appreciate from the shape of the tuning curves in Fig. 2 that the
way stimuli were delivered to the cochlea leaves much to be desired and
part of the scatter of points in Figs. 3 and 4 may actually result from
misjudging the CF of some of the units. Other ways of stimulus
application with better high frequency performance may do much to remedy
this situation and thus may improve the prospects for differentiating
units from closely neighbouring regions in the basal turn in terms of
CF and click response latency. While this reservation is appropriate
concerning these more delicate distinctions between units it is hardly
necessary with respect to the gross differences between units clearly
demonstrable within the capabilities of the employed acoustic system.

Fig. 4. Latency of first peak in PST histograms of responses to
rarefaction clicks plotted versus estimated unit CF and estimated unit
distance from the basal end of the basilar membrane (100 µsec clicks
at 0 dB re 100 V into earphone; click repetition rate 10 per sec). The
shaded area to the left shows latency histogram versus CF: the
arithmetic mean of latency values found in 5 kHz wide bins is plotted.
The shaded area in the back shows latency histogram versus distance:
the arithmetic mean of latency values found in 0.5 mm wide bins is
plotted. The dashed line in the baseplane is drawn according to
Greenwood's function (see Fig. 3). - Represented are 23 units from
10 cats.

As the comparison between tone burst response patterns of units 52-16
and 52-18 in Fig. 2 shows there were differences between units occurring
at levels not much above unit threshold. While the PST histogram of unit
52-16 follows the pattern described for auditory nerve fibres (Kiang et
al., 1965) the one of unit 52-18 does not. The shape of this histogram
is different and there is a characteristic sharp peak and a dip at the
beginning which is followed by the flat remaining part of the histogram.




Fig. 5. PST and INT histograms of tone burst responses. Conventional type
arranged in the middle (unit 59-6), extreme cases of deviation represented
at top and bottom respectively (units 58-3 and 59-1). Formation of Group
N and P according to bars: Group N is centered at "normal" pattern (full
part of bar) and Group P is centered at "peaked" pattern (allowance for
response variability is indicated by empty parts of bars). Stimuli:
frequency and level of tone bursts as indicated; level in dB re 200 V
p-p into earphone; burst duration 25 msec, repetition rate 10 per sec,
2.5 msec rise and fall time. Zero time of PST histograms is 2.5 msec
prior to the onset of the electric stimulus to the earphone. Length of
runs is 1 min.
Panels, top to bottom: units 58-3, 53-1, 59-6, 59-4 and 59-1. Stimulus
labels: 8 kHz, -10 dB; 25 kHz, -5 dB; 21.4 kHz, -5 dB; 21 kHz, -5 dB;
21.3 kHz, -5 dB.





There is also a conspicuous difference in dead time in the INT histograms
of the tone burst responses with the INT histogram of unit 52-18 showing
the longer dead time. (Dead time denotes the interval without spike
entries in the INT histogram.)
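
The definition of dead time given above translates directly into a short computation. The sketch below is an illustration only: the spike times, the bin width of 0.5 msec chosen for the INT histogram and the function names are assumptions, and the actual processing followed Kiang et al. (1965). The INT (interspike interval) histogram is built from successive spike times, and the dead time is read off as the lower edge of the first occupied bin.

```python
def interval_histogram(spike_times_ms, bin_width_ms=0.5, n_bins=40):
    """Count intervals between successive spikes in bins of bin_width_ms."""
    counts = [0] * n_bins
    for earlier, later in zip(spike_times_ms, spike_times_ms[1:]):
        index = int((later - earlier) // bin_width_ms)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


def dead_time_ms(counts, bin_width_ms=0.5):
    """Lower edge of the first bin with entries, or None if all bins are empty."""
    for index, count in enumerate(counts):
        if count > 0:
            return index * bin_width_ms
    return None


# Hypothetical spike times in msec (intervals: 1.2, 1.8, 1.1, 3.8, 1.1).
spikes = [0.0, 1.2, 3.0, 4.1, 7.9, 9.0]

histogram = interval_histogram(spikes)
print(histogram[:10])
print(dead_time_ms(histogram))
```

With this definition the resolution of the dead time estimate is limited by the bin width, so a unit with few spikes per run may appear to have a longer dead time than it would with more data.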

The spectrum of response types obtained in the study is displayed in the
array of Fig. 5. Patterns are arranged in the order of increasing
prominence of the initial peak seen in the PST histogram of tone burst
responses. Thus at the top and the bottom of the array we find the
extreme cases of deviation from the norm of auditory fibre responses,
here represented by unit 59-6. Unit 58-3 at the top shows a PST histogram
with the initial peak lacking altogether; instead the initial portion of
the PST histogram is sloped. At the bottom we see the peaked type of
pattern represented by unit 59-1. - From the fact that drastically
different patterns were recordable from neurones in the same preparations
(see e.g. units 59-6, 59-4 and 59-1) the question arises whether response
diversity allows for classification of neurones into different classes. If
it were only for the extreme departures from the standard pattern class
formation would appear to be straightforward enough. However there were
also patterns apparently intermediate to the norm and the extremes (see
units 53-1 and 59-4), a circumstance that causes doubts as regards class
formation. Thus one could argue that the array of PST histograms in Fig. 5
suggests a rather different scheme to accommodate unit diversity, a
scheme more akin to a continuum than to a set of separable classes.

Nevertheless units were grouped tentatively into two different classes in
such a way that allowance was made for the changes in the shape of PST
histograms with tone burst duration and repetition rate and also for the
occasionally seen instability of the response patterns (Kohllöffel, 1974).
The bars in Fig. 5 illustrate this scheme. Group N contains units which
primarily showed the "normal" response pattern (see unit 59-6) with an
allowed deviation from the norm indicated by the empty parts of the bar.
Group P comprises units which predominantly showed the "peaked" type of
response with an allowed deviation up to the "normal" pattern.

Tone burst responses were recorded from 97 units in 24 different cats.
Approximately 77% of units fell into Group N and approximately 23% fell
into Group P. There was no discernible correlation of unit classes thus
formed with unit features such as spike waveforms, spontaneous rates,
tuning curves and click response latencies except for the tendency of the
dead time to assume values larger for P type units than for N type units.
The last point is illustrated in Fig. 6 where the histogram of N and P
type units is given with the dead time in the INT histogram of tone burst
responses as the abscissa.

Fig. 6. Distribution of N and P type units according to dead time seen
in the INT histogram of tone burst responses. Bin width is 0.5 msec.
Represented are 75 units in Group N and 22 units in Group P from 24 cats.

The apparent lack of sharp contours between classes may be at least partly
due to the criteria used in assigning units to different groups, criteria
intended to allow for some variability in the response patterns. This is
not the only reservation and the apparent absence of class differences
with respect to tuning curves and click response latencies may result
from inadequacies of the experimental set up mentioned before in the
context of Figs. 3 and 4. In returning to Fig. 2 we can see from the
response to rarefaction clicks that the N type unit 52-16 led the P type
unit 52-18 by approximately 0.4 msec despite the fact that it was farther
removed from the basal end of the basilar membrane by 0.27 mm. Without
implying any generality for this instance it is possible however that such
differences may get masked by pooling units from different preparations.
It is conceivable that pooling adds to smear boundaries between unit
classes and it seems desirable to confine unit comparisons to results from
individual preparations. However considering the low yield of units per
preparation this appears to be difficult to realize.

DISCUSSION: A preparation was developed to allow recordings from spiral
ganglion neurones through the round window in the cat's basal turn. The
preparation offers the promising prospect of visual control over related
regions in the ganglion and the basilar membrane. A first attempt to
exploit this situation was undertaken by exploring the relation of unit CF
and click response latency to unit location along the basilar
membrane. - While the bulk of recorded neurone characteristics followed
closely the patterns described for high CF auditory nerve fibres (Kiang
et al., 1965) - an aspect of the study expounded at length elsewhere
(Kohllöffel, 1974) - there were some units which in part of their
discharge properties departed significantly from the patterns known to hold
for fibres. This concerns the long dead times seen in the INT histograms
and the peaked PST histograms in response to burst stimuli. - Units were
classified into two groups according to the shape of the PST histograms;
however, there was practically no noticeable relation of the two classes to
other aspects of unit firing, except for the tendency of the dead time to
be longer for units with peaked PST histograms. It appears that different
means of stimulus delivery, better matched to the task of recording from
high CF units, may turn out to be useful in the differentiation of unit
behaviour in the basal turn.

The intriguing fact of neurone behaviour not conforming to auditory nerve
data appears to be relevant, particularly so in the context of
Spoendlin's (1971) recent anatomical findings. However, whether
differences in neurone morphology can explain this fact cannot be
deduced from the present study.

ACKNOWLEDGEMENTS: The experimental work reported here was conducted at
the Eaton-Peabody Laboratory in Boston, Mass., U.S.A. I want to thank
Dr. N.Y.-S. Kiang, Director of the Laboratory, and his co-workers for the
most generous help I have received in connection with the conceptual
and practical difficulties of the project.





This work was supported by NATO Scholarship 430/402/734/2 and U.S.
Public Health Service Grants 5 R01 NS01344, 5 P01 0014940 and
5 S01 RR05485.



REFERENCES 

DOWBEN, R.M., and ROSE, J.E. (1953): A metal-filled microelectrode.
Science 118, 22-24.

GREENWOOD, D.D. (1961): Critical bandwidth and the frequency coordinates
of the basilar membrane. J.Acoust.Soc.Amer. 33, 1344-1356.

KIANG, N.Y.-S., WATANABE, T., THOMAS, E.C., and CLARK, L.F. (1965):
Discharge patterns of single fibers in the cat's auditory nerve
(The MIT Press, Cambridge, Mass.).

KIANG, N.Y.-S., SACHS, M.B., and PEAKE, W.T. (1967): Shapes of tuning
curves for single auditory-nerve fibers. J.Acoust.Soc.Amer. 42,
1341-1342.

KIANG, N.Y.-S., MOXON, E.C., and LEVINE, R.A. (1970): Auditory-nerve
activity in cats with normal and abnormal cochleas. In: Sensorineural
hearing loss, G.E.W. Wolstenholme and J. Knight, Eds. (Churchill,
London), pp. 241-273.

KIANG, N.Y.-S., and MOXON, E.C. (1972): Physiological considerations in
artificial stimulation of the inner ear. Ann.Otol.Rhinol.Laryngol. 81,
714.

KOHLLÖFFEL, L.U.E. (1974): A study of neurone activity in the spiral
ganglion of the cat's basal turn. To be submitted to Arch.klin.exp.
Ohr.-, Nas.- u.Kehlk.Heilk.

SCHUKNECHT, H.F. (1960): Neuroanatomical correlates of auditory
sensitivity and pitch discrimination in the cat. In: Neural mechanisms
of the auditory and vestibular systems, G.L. Rasmussen and
W.F. Windle, Eds. (Charles C. Thomas, Springfield, Ill.), pp. 76-90.

SPOENDLIN, H. (1971): Degeneration behaviour of the cochlear nerve.
Arch.klin.exp.Ohr.-, Nas.- u.Kehlk.Heilk. 200, 275-291.

SVAETICHIN, G. (1951): Analysis of action potentials from single spinal
ganglion cells. Acta Physiol. Scand. 24, Suppl. 86, 23-57.







COMMENT ON: Recordings from spiral ganglion neurones (L.U.E. Kohllöffel)
E.F. EVANS 

Department of Communication, University of Keele, Staffs, U.K. 

In Fig. 2 of Kohllöffel's article, the frequency threshold ("tuning")
curves obtained from two spiral ganglion cells are shown. It is claimed 
that these curves are "very similar to those reported for [cochlear 
nerve] fibres with high characteristic frequency". 

In our own experience in cats and guinea pigs, curves of the shape 
illustrated are indicative of damage to the cochlea, either from local 
mechanical damage to the cochlear partition or from impairment of the 
cochlear blood supply. Frequency threshold curves with a sharply tuned 
low threshold segment as well as a broadly tuned high threshold segment, 
as in Figs. 1-5 of my article in this volume, are characteristic of 
normal cochlear nerve fibres with CFs from about 3kHz to at least 45kHz. 
The FTC for a fibre with a CF of 45 kHz is shown in Fig. 1 below. Sound
stimuli were delivered by a half inch condenser microphone under similar
conditions to those of Kohllöffel (closed system; open bulla).




Fig. 1. Frequency threshold curve of cat cochlear nerve fibre with CF of
45 kHz. Abscissa: tone frequency (kHz), logarithmic scale from 0.1 to 100.
Ordinate: relative electrical signal level to condenser driver.
Represents approximate SPL at the tympanic membrane (-8, 0, +8, -4 dB at
1, 6, 12, and 40 kHz, respectively).







ADDITIONAL REMARKS 



KOHLLÖFFEL: In the ganglion there were also units with sharper tuning
curves than those of Fig. 2. In addition to the references quoted in
my paper I would like to draw your attention to the article by Kiang and
Moxon (1974, J. Acoust. Soc. Amer. 620-630, Fig. 13). I am restricting
comparison to fibre recordings done at the Eaton-Peabody Laboratory, where
the ganglion work was done too. This might minimize discrepancies of
methodological origin (such as differences in acoustic systems) in the
results. Of course the risk of surgical damage to the cochlea is greater
for the spiral ganglion preparation than for the auditory nerve
preparation. It would be interesting to investigate possible effects of
anoxia and your pharmacological agents on my shallow tuning curves from
high CF fibres.



TONNDORF: The total removal of the RW membrane may have an effect upon the 

traveling waves ( = l.p. filter effect?). Did you check the ^ position 
(waveform!) of the click response? 

KOHLLÖFFEL: I did check the threshold of the response and the CM to
clicks before and after removal of the round window membrane and frequently
while recording. The data discussed here are from preparations with
normal and stable thresholds. The waveforms of these responses were
inspected only grossly. - I am aware that there are many possibilities for
uncontrolled disturbances of the investigated cochlear region. Such
disturbances, however, need not necessarily manifest themselves as changes
in the potentials recorded by a wire electrode near the round window. As
regards pressure across the partition, little or no effect due to
destruction of the round window membrane could be found (Nedzelnitsky, 1974;
"Measurements of Sound Pressure in the Cochleae of Anesthetized Cats,"
Sc.D. Thesis, MIT; Figs. 4.15, 4.16).







CODING OF REPETITION NOISE IN THE COCHLEAR NUCLEUS IN 
CAT 

G.BOERGER 

Heinrich-Hertz-Institut, Berlin-Charlottenburg, FRG
Introduction 

It is necessary to assume that the detailed time structure of
spike trains is a mediator of sensory information for certain
sensory properties (directional hearing). For others it is
probably so (perception of repetition pitch). In these cases
it is assumed that the fine structures of the stimulus time
function and the spike train of individual neuronal elements
are correlated in a distinct manner.

In the following report the results obtained from the measure- 
ment of such time parameters of the spike train will be set 
out. The single neuronal elements of the cochlear nucleus in 
the cat were monitored by means of acoustic stimulation with 
short clicks, white noise and repetition noise. 

Special attention was paid to those units whose stimulation 
response behavior points to a broad tuning. 

Methods 

The data for this report were gathered from adult cats which
showed no sign of ear damage (inspection of the tympanic
membrane).

The surgical procedures and pharmacological treatments used
were similar to those described by Kiang et al. (1965).
Anaesthesia was induced by intraperitoneal administration of
sodium pentobarbitone (32 mg per kg body weight). In addition
a glucose salt solution (5 ml per kg body weight per hour) was
delivered intravenously. Access to the cochlear nucleus was made
possible by partial aspiration of the dorsal cerebellum
and its retraction from the brain stem. The acoustic stimuli
were delivered and monitored by condenser microphones using
a closed system, similar to that described by Kiang et al. (1965).

The following were used as stimuli:

§ a series of clicks, repetition rate = 30/sec, width
  of the electrical signal in the earphone = 100 µs
§ continuous white noise
§ continuous repetition noise
  R(t) = N(t) ± N(t + Δt)
  + sign: "0-repetition noise"
  - sign: "π-repetition noise".

The notations of the levels L refer to:

§ 2 V into the condenser earphone (B&K 4131) for clicks,
§ 0.5 V rms for noise signals.

At 0.5 V rms a sine wave of 1 kHz produces an SPL of about
90 dB in front of the tympanic membrane.
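
The repetition-noise stimulus R(t) = N(t) ± N(t + Δt) defined above can be
illustrated with a short digital sketch. This is not the analog set-up
used in the experiments; the sampling rate, the Gaussian noise source and
the function names (repetition_noise, autocorrelation) are assumptions
made only for illustration.

```python
import random


def repetition_noise(n_samples, delay_samples, sign=+1, seed=0):
    """Return R[i] = N[i] + sign * N[i + delay_samples].

    N is Gaussian white noise with unit variance; sign=+1 gives
    "0-repetition noise", sign=-1 gives "pi-repetition noise".
    """
    if sign not in (+1, -1):
        raise ValueError("sign must be +1 or -1")
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples + delay_samples)]
    return [noise[i] + sign * noise[i + delay_samples] for i in range(n_samples)]


def autocorrelation(signal, lag):
    """Mean lagged product of the signal at a non-negative lag."""
    n = len(signal) - lag
    return sum(signal[i] * signal[i + lag] for i in range(n)) / n


# Assumed sampling rate of 10 kHz: 32 samples correspond to 3.2 ms.
DELAY = 32
zero_rep = repetition_noise(20000, DELAY, sign=+1)
pi_rep = repetition_noise(20000, DELAY, sign=-1)
```

For unit-variance noise the expected autocorrelation of R at a lag equal
to the delay is +1 for 0-repetition noise and -1 for π-repetition noise,
against 2 at lag zero. This is the stimulus property that a unit would
have to preserve in its spike train.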

Recording from a single neuronal element was performed with 
electrolyte-filled glass micropipettes. This electrode type 
was chosen because this method permits the certainty of recor- 
ding a single neuronal element. Because many elements of the 
cochlear nucleus have no spontaneous activity , this type of 
reliable discrimination, particularly in the noise signals, is 
important. In addition this type of electrode causes minimal 
damage to the nervous tissue. 

Unfortunately this electrode registers the discharges from the 
cells in the same way as those from the fibres. For that rea- 
son the results cannot unequivocally be assigned to second 
order neurons. 

The spike discharges were amplified (passband 1 to 10 kHz)
and analysed by a PDP-12 computer.

The relationship between the applied stimuli and the way the
data are presented is summarized in Table I.

Table I

stimulus                       data presentation
-----------------------------  -----------------------------------------
click                          PST-histogram
continuous white noise         impulse response by triggered correlation
continuous repetition noise    autocorrelation function

In calculating the autocorrelation function (ACF), the spike
train is replaced by a discrete time series of zeros and ones
in the same way that post-stimulus-time and interval histograms
are calculated. In this way the ACF can be seen as the sum of
the interval histograms of order k = 0, 1, 2, ... (the
conventional interval histogram is k = 1).

The constituent for k=0 was omitted in the ACF because it only 
gives the total count of the spikes under investigation. 
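
The statement that the ACF of the zero-one series equals the sum of the
interval histograms of all orders k >= 1 can be checked with a minimal
sketch. The function names and the toy spike times are hypothetical, and
at most one spike per bin is assumed, as the zero-one representation
implies.

```python
def to_bins(spike_times, bin_width):
    """Convert spike times to sorted, unique bin indices (0/1 series)."""
    return sorted({int(t // bin_width) for t in spike_times})


def interval_histogram(bins, order, max_lag):
    """Histogram of intervals between each spike and its order-th successor."""
    hist = [0] * (max_lag + 1)
    for i in range(len(bins) - order):
        lag = bins[i + order] - bins[i]
        if lag <= max_lag:
            hist[lag] += 1
    return hist


def spike_acf(bins, max_lag):
    """ACF of the 0/1 series for lags 1..max_lag (lag 0 left at zero)."""
    occupied = set(bins)
    acf = [0] * (max_lag + 1)
    for b in bins:
        for lag in range(1, max_lag + 1):
            if b + lag in occupied:
                acf[lag] += 1
    return acf


def summed_interval_histograms(bins, max_lag):
    """Sum of the interval histograms of order k = 1, 2, ..."""
    total = [0] * (max_lag + 1)
    for order in range(1, len(bins)):
        hist = interval_histogram(bins, order, max_lag)
        total = [a + b for a, b in zip(total, hist)]
    return total


example_bins = to_bins([0, 1, 3, 6], bin_width=1)
```

With the toy spike train above, both routes count the same pairs of
spikes: the first-order histogram contains the intervals 1, 2 and 3,
and the higher orders add the intervals 3, 5 and 6.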

From unit labels the following can be identified:

§ first 3-digit number: the cat
§ second 1-digit number: the neuron
§ third 1-digit number: the test run.

Results 

In the following section the discharge patterns of units are 
shown which are typical with regard to the anticipated charac- 
teristics. This does not exclude the possibility that other 
units possess these qualities, if only incompletely. 

One group comprises units whose click responses show a
remarkably small scatter of latency in the PST-histogram. The
PST-histograms in these cases were for the most part unimodal
at low and moderate levels. At higher levels further peaks
could appear, although with a large scatter of latency.
Figure 1 shows the PST-histogram of a unit with precise timing
(bin width 2.5 µs). The origin of the abscissa lies far outside
the range shown. The latency measured from the beginning of the
electrical signal to the peak amounts to 3.29 ms. The width of
the peak at 50 % of its height amounts to about 60 µs. The noise
of the electrode creates





an additional uncertainty as the event times are determined so 
that the true width could be even smaller . 
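
How a peak width of this kind can be read from a PST-histogram is
sketched below. The routine names, the histogram origin and the toy
counts are assumptions for illustration only; they do not reproduce the
original PDP-12 analysis or the data of Fig. 1.

```python
def pst_histogram(latencies_us, bin_width_us, start_us, n_bins):
    """Count spike latencies (in microseconds) into equal-width bins."""
    counts = [0] * n_bins
    for t in latencies_us:
        idx = int((t - start_us) // bin_width_us)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def width_at_half_height(counts, bin_width_us):
    """Width of the contiguous run around the peak with counts >= half the peak."""
    peak = max(counts)
    if peak == 0:
        return 0.0
    half = peak / 2.0
    lo = hi = counts.index(peak)
    while lo > 0 and counts[lo - 1] >= half:
        lo -= 1
    while hi < len(counts) - 1 and counts[hi + 1] >= half:
        hi += 1
    return (hi - lo + 1) * bin_width_us


toy_counts = [0, 1, 4, 8, 10, 7, 3, 0]           # hypothetical histogram
toy_width = width_at_half_height(toy_counts, 2.5)  # bin width 2.5 microseconds
```

The resolution of such an estimate is limited to one bin width, which is
one reason why a small bin width is needed for units with precise timing.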




Fig. 1. PST-histogram of click response, L = -40 dB,
latency = 3.29 ms, unit 162-3-2,
characteristic frequency (CF) = 1.3 kHz.



Stimulation with repetition noise leads to the accumulation of
certain spike distances, which show up in the autocorrelation
function of the spike train. With 0-repetition noise the ACF
of this unit shows its absolute maximum at τ = Δt
(Fig. 2).

Fig. 2. Autocorrelation function, 0-repetition noise, L = -40 dB,
Δt = 3.16 ms, unit 162-3-3, CF = 1.3 kHz.
Abscissa: 0 to 10 ms.




The remaining maxima are grouped at multiples of the
reciprocal of the characteristic frequency (1/CF).



Another example of a unit which reflects the delay time of
repetition noise through its discharge pattern is given in
figure 3. In the case of π-repetition noise the ACF is
minimal at τ = Δt (Fig. 3b).



The prominence of stimulus-induced discharge distances in
comparison to those produced by the action of the cochlear
filter has been observed only when the CF was below 3 kHz. This
characteristic was found in only 6 of the 49 units
with characteristic frequencies below this limit.





Fig. 3. Autocorrelation functions, L = -40 dB, Δt = 4.9 ms,
CF = 0.85 kHz. Abscissa: 0 to 10 ms.
a) 0-repetition noise, unit 107-1-1.
b) π-repetition noise, unit 107-1-2, other parameters as in
fig. 3a.

Examples for the more frequent case are shown in figures 4 
and 5. In this case the discharge parameters are similar to 
those of primary fibres. Click stimulation produced a multi- 
modal PST-histogram with considerably broader modes . For com- 
parison with fig. 1, the first mode is selected and shown in 
fig. 4. The time scale is the same as in fig. 1. 

Fig. 4. PST-histogram of click response, L = -30 dB,
latency = 3.76 ms, unit 153-4-2, CF = 1.0 kHz.

Stimulation by repetition noise makes those discharge distances
most prominent which can be explained by the intrinsic
characteristics of the elements (Fig. 5).

Fig. 5. Autocorrelation function, 0-repetition noise, L = -40 dB,
Δt = 3.16 ms, unit 153-4-4, CF = 1.0 kHz.
Abscissa: 0 to 10 ms.









The position of identified units from which recordings were 
made could only be roughly determined. It was concluded from 
the place, direction and depth of the micro electrode, that 
the units which were tested belonged either to the dorsal coch- 
lear nucleus or to the anterior ventral cochlear nucleus. 

Discussion 

This research into the measurement of discharge parameters was 
originally undertaken in order to determine whether the cha- 
racteristics of repetition noise can be represented by the dis- 
charge parameters of a single unit. 

Previous work has shown that the time structure of repetition
noise can no longer be seen in the discharge parameters of
primary fibers (Boerger, Gruber, 1971). The fact that the timing
of discharges is governed by oscillations of the basilar membrane
at the CF of the unit suffices to explain this phenomenon. With
regard to the primary fibers, the analysis performed by
the peripheral ear takes place predominantly in the frequency
domain.

If we consider the possibility of a convergence of differently
tuned primary fibers on one secondary element, we may assume
that the timing of discharges from the individual primary
elements is blurred by the postsynaptic integration.

On the other hand, this integration was expected to bring the
common timing of all participating fibers into prominence. This
assumption is verified by the results given in figs. 2 and 3.

We observe that those spike distances which are determined by
the stimulus are most prominent.

We may call these neuronal elements "time domain units" (TDU) . 

Occasional measurements of click PST-histograms from these 
units indicate that individual discharges are exceptionally 
precisely time locked with the acoustic click (Fig. 1). 

Additional justification for the label TDU comes from the fact
that TDUs show more rapidly decaying impulse responses than
others (Fig. 6).




Fig. 6. Impulse responses by triggered correlation, white noise.
First unit: L = -40 dB, unit 107-1-3, CF = 0.85 kHz. This is the
same unit as shown in fig. 3.
Second unit: L = -30 dB, unit 153-4-1, CF = 1.0 kHz. This is the
same unit as shown in figs. 4 and 5.



The impulse response was determined by applying the method of 
triggered correlation (de Boer, Kuyper, 1968). 
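
Triggered correlation is commonly understood as averaging the stimulus
waveform that precedes each spike. The following sketch shows that idea
only; the function name, the sample-index representation of the spikes
and the toy data are assumptions and do not reproduce the procedure of
de Boer and Kuyper or the analysis actually used here.

```python
def triggered_correlation(stimulus, spike_indices, window):
    """Average the stimulus samples preceding each spike.

    Element k of the result is the mean stimulus value (k + 1) samples
    before a spike. Spikes too close to the start are skipped.
    """
    acc = [0.0] * window
    used = 0
    for s in spike_indices:
        if s < window:
            continue  # not enough stimulus history before this spike
        for k in range(window):
            acc[k] += stimulus[s - 1 - k]
        used += 1
    if used == 0:
        raise ValueError("no spike has a complete pre-spike window")
    return [a / used for a in acc]


toy_stimulus = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
toy_average = triggered_correlation(toy_stimulus, [1, 3, 6], window=2)
```

With a white-noise stimulus and many spikes, an average of this kind
yields an estimate of the impulse response of the filter preceding the
spike generator, which is what Fig. 6 displays for two units.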

The question of whether these units are engaged in directional 
hearing or even in the perception of repetition pitch is no 
longer a matter of physiological facts and is of course an 
open one. 



References 



de Boer, E., Kuyper, P. (1968). "Triggered Correlation,"
IEEE Trans. on Bio-Med. Eng., BME-15, 169-179.

Boerger, G., Gruber, J. (1971). "Codierung von Rauschsignalen
durch das periphere Gehör der Katze,"
Proc. Seventh Int. Congr. Acoustics Budapest, 497-500.

Kiang, N.Y.-S., Watanabe, T., Thomas, E.C. and Clark, L.F. (1965).
"Discharge Patterns of Single Fibers in the Cat's Auditory
Nerve," MIT Res. Mono. No. 35.







COMMENTS ON: "Coding of repetition noise in the cochlear nucleus in cat" 
(G.BOERGER) 

J.P. WILSON 

Department of Communication, University of Keele, Keele, UK

I would like to mention the close agreement we have observed between 
psychoacoustical measurements in man and single cochlear nerve fibres re- 
cordings in cat using the same comb-filtered noise stimuli as Dr. Boerger. 
In the cat experiments (Wilson and Evans, 1971; Evans and Wilson, 1973) 
the stimulus delay (At) was set so that a peak (N = At x CF) in the comb- 
filtered noise spectrum coincided with the characteristic frequency (CF) 
and the response noted; the spectrum was then 'inverted' (π noise) so
that a spectral dip corresponded with the CF. This was repeated for 
successively higher peak numbers (longer delays) until the fibre no longer 
responded differentially to peaks and dips. By 'calibrating' the nerve 
with white noise steps it was possible to express the results as resolved 
contrast in dB. These contrasts ranged from 10 - 20 dB for widely spaced
patterns (N = ½) to zero at the limit of resolution, which lay at peak
numbers from about 3 for CFs below 500 Hz to 10-12 for high CFs. A
psychophysical experiment in which the same stimulus was observed via
one-third octave filters and the delay increased until the component peaks
could not be resolved gave a similar limit at high frequencies below
4 kHz, presumably due to temporal features in the stimulus. A recent
experiment (Wilson and Seelman, in preparation) in which a tone was masked 
alternately by the same spectral peaks and dips, however, gave masking 
level differences which ranged from 10-20 dB for broad patterns (N = ½)
to zero at the limit of spectral resolution. These masking level dif- 
ferences correspond conceptually with the resolved contrast values from 
cat cochlear fibres and agree numerically for all values of N at the same 
frequency. Limiting values of N ranged from 2 at 100 Hz to 10 above 1 kHz 
and again corresponded exactly with the neural values. In the psycho- 
physical case, however, this limit was invariant from 40 dB SPL (the 
minimum level giving sufficient masking) to 120 dB SPL. 
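
The relation N = Δt x CF between delay, characteristic frequency and
peak number can be made concrete with a small sketch. It assumes that
the noise and its delayed copy are added with equal amplitude, in which
case the power gain of the resulting comb filter is
2 + 2 cos(2π f Δt) for 0-repetition noise and 2 - 2 cos(2π f Δt) for
π-repetition noise. The function names are hypothetical.

```python
import math


def comb_power_gain(freq_hz, delay_s, sign=+1):
    """Power gain at freq_hz when a signal is added to (sign=+1) or
    subtracted from (sign=-1) its copy shifted by delay_s."""
    return 2.0 + 2.0 * sign * math.cos(2.0 * math.pi * freq_hz * delay_s)


def delay_for_peak(cf_hz, peak_number):
    """Delay that places spectral peak number N at the CF (N = delay * CF)."""
    return peak_number / cf_hz


def peak_number(cf_hz, delay_s):
    """Peak number N = delay * CF for a given delay and CF."""
    return delay_s * cf_hz


# Example: a fibre with CF = 1 kHz and the third spectral peak placed at the CF.
example_delay = delay_for_peak(1000.0, 3)
```

In this example the delay is 3 ms; the gain at the CF is then 4 for the
added copy (a spectral peak) and 0 for the subtracted copy (a spectral
dip), which corresponds to the 'inversion' of the spectrum described
above.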







COMMENTS ON: Coding of repetition noise in the cochlear nucleus in cat 

(G.BOERGER) 

F.A.BILSEN 

Applied Physics Department, Delft University of Technology, Netherlands 

In his paper Boerger raises the question whether temporal coding of
"repetition noise" in TDUs is relevant for the perception of Repetition
Pitch (RP). To elaborate this question, which is basic to our
understanding of pitch mechanisms, let us summarize the main facts on pitch
from psychophysical experiments and compare these with data from
electrophysiology.

RP corresponds to 1/τ (equivalent Hz) if τ is the delay time between
continuous white noise and its added (delayed) repetition. Like periodicity
pitch (residue pitch, virtual pitch, etc.) it shows the first effect of
pitch shift. Detailed explanations are possible in terms of "temporal fine
structure detection at the peripheral level" as well as in terms of
"(an)harmonic spectral pattern recognition at a more central level"
(Bilsen and Ritsma, 1969; Bilsen and Goldstein, 1974).

In favor of the latter hypothesis are experimental findings such as the
existence of periodicity pitch for dichotic two-tone complexes (Houtsma and
Goldstein, 1972), indicating that low pitch is derived from stimulus
components that are spectrally resolved. This is in agreement with the
spectral-dominance phenomenon, indicating the relative importance of the
lower (resolved) harmonics around the 4th (for RP see Bilsen and Ritsma,
1969). Also the finding that, for τ > 3 ms, a similar pitch (Dichotic
Repetition Pitch, DRP) is perceived if white noise is presented to one ear
and the delayed noise to the other ear seems compelling evidence against
the classical hypothesis of temporal fine structure detection *.

For DRP neither spectral cues nor temporal cues are available at the
peripheral level of either of the ears (Bilsen and Goldstein, 1974). From
these experiments it is concluded that known binaural and monaural
phenomena on pitch of complex sounds appear to be compatible with a
generalized place theory of pitch in which a central pitch mechanism reads
across the frequency dimension of a centrally represented spectrum
("central spectrum").

* The classical hypothesis of temporal fine structure detection considers
the preservation of the fine structure of a complex waveform on the basilar
membrane in the pattern of VIIIth-nerve spikes. For pitch extraction, there
still is no conclusive evidence against the use of temporal information
carried by synchronous spike trains from the spectrally resolved harmonics
of a complex waveform.

Evidence from electrophysiology giving support to the spectral theory
is provided by ten Kate et al. (1974). They found that the response of
cochlear nucleus units in cat to "repetition noise" shows periodic
fluctuations with relative maxima at n/CF (CF is the unit's characteristic
frequency; n is a positive integer). An absolute maximum is often observed
for n = 3 or 4. Fig. 1 of "Comments on masking patterns and lateral
inhibition" (Bilsen, this symposium) shows a typical result. Viewed as a
"neural spectrum" of "repetition noise", this figure is highly suggestive
in "explaining" two important aspects of pitch perception, viz. a) spectral
dominance of harmonics around the 4th, and b) absence of pitch for the
higher harmonics above about the 10th (spectrally unresolved).

In the light of this psychophysical and electrophysiological evidence,
Boerger's suggestion on the role of timing for pitch perception seems highly
disputable. On the other hand, his finding may be relevant for the
sensation of a "periodic rattle" evoked by repetition noise for about
τ > 20 ms. Boerger's statement that TDUs were never found for CFs above
3 kHz supports the latter rather than the former possibility. For temporal
mediation of RP one would expect TDUs with a CF up to at least 10 kHz,
since the RP existence region has its spectral limitation above 5 kHz (with
a cat-human conversion factor of 1.8 (see Zwicker, this symposium) this
translates into about 10 kHz).

REFERENCES 

Bilsen, F.A. and Ritsma, R.J. (1969/70). Repetition Pitch and its implication
for hearing theory. Acustica 63-73.

Bilsen, F.A. and Goldstein, J.L. (1974). Pitch of dichotically delayed noise
and its possible spectral basis. J.Acoust.Soc.Am. 55, 292-296.

Houtsma, A.J.M., and Goldstein, J.L. (1972). The central origin of the pitch
of complex tones: evidence from musical interval recognition.
J.Acoust.Soc.Am. 51, 520-529.

Kate, J.H. ten, Bilsen, F.A., Raatgever, J. and Buunen, T.J.F. (1974). Single
unit responses in acoustic nuclei of cat to noise and its attenuated
repetition. Accepted for 8th I.C.A., London.







INFORMATION PROCESSING IN THE HIGHER PARTS OF THE AUDITORY PATHWAY 
W.D. KEIDEL 

I. Physiologisches Institut der Universität Erlangen-Nürnberg
Erlangen, FRG 



While in the visual system the role of the cortex in the decoding
processes is well known, and even highly sophisticated neurons have
been detected (Hubel and Wiesel), the organizational structure
within the auditory system seems to differ markedly in so far as
quite large parts of the decoding functions in audition are evidently
located within the medial geniculate. We do not know in detail what
the cortex adds by its performance to that basic function at
thalamic level. But it can be speculated that it is the great
storage capability of the auditory cortex, rather than any additional
decoding processes, which distinguishes it.

In addition, it could be shown (David, Finkenzeller, Kallert and
Keidel - 7, 8, 9, 12, 16, 17, 18, 19, 20, 22) that at the next lower
level of the central auditory pathway, namely at colliculus level,
neuronal networks act together so that a formal harmonic analysis
is performed. Therefore this part of the auditory channel might
be of some importance for our ability to hear music (and sometimes
- depending upon the composer - even to enjoy it). A nice piece of
evidence for that theory, which could be substantiated by some
experimental work, is the fact that people having absolute pitch shift
their pitch judgements downwards when their brain temperature is
temporarily raised, e.g. in the course of a flu. This is consistent with
the theory in so far as the constant and invariant clock acting as
pacemaker for that neuronal net obviously depends upon temperature shifts
of the brain's blood supply, raising its "characteristic" periodicity
with increasing temperature according to the functional connection
between the speed of chemical reactions and temperature. Although our
laboratory has reported on those structures in more detail elsewhere,
a sketch is shown in the first figure to explain this network's
function. It also makes clear what an important role both the
colliculus and the geniculate play in the auditory decoding processes.




Fig. 1: Bottom row: PST-histograms of a single unit to tone bursts
        (medial geniculate), same intensity, different frequencies
        increasing from left to right as indicated.

        Top: Frequency response curve of the same unit.

        Ordinate: Integral value of the histograms shown at bottom.
        Abscissa: Frequency of the tone bursts.

        The frequency-response curve shows clear maxima of the
        spike rate at the integer multiples of the fundamental
        frequency f0. For comparison a typical single peak response
        curve of a neuron in the colliculus is shown in the insert.
        For explanation a model has been described in which an
        invariant clock's periodicity is compared by the neuron with
        the demultiplicated frequency of the auditory stimulus. By
        varying the stimulus frequency, and so also the
        demultiplicated periodicity of this neuron, its output
        activity is enhanced maximally for the cases of coincidence
        of the variable and invariable periodicities (7, 14, 17).





An extensive collection of the recent literature on microelectrode
studies of the central part of the auditory pathway in vertebrates
was compiled by Kallert in 1973 (13). In addition he succeeded
in developing a telemetric technique for recording single
unit activity in the awake cat by means of a special subminiature
device. Those experiments were preceded by quite a large number
of papers dealing with microelectrode studies on the anesthetized
cat during the last decade (5, 6). All details are described
elsewhere; the studies yielded some interesting results, part of
which are in good agreement with Aitkin, Dunlop, Webster (1, 2, 3,
4, 26) and others with regard to the different types of histograms
revealing on-, off- as well as sustained and periodic activities at
that level. Those types of auditory neurons at higher levels can in
general be subdivided into primary-like, chopper- and two other
types which showed just the inverse temporal patterns of the
histograms mentioned above. Accordingly, in the next figure (2) a
comparison of all sorts of higher neuronal activity is compiled,
which is more or less identical for all levels of the auditory system
above the spiral ganglion and the cochlear nucleus. In this respect all
experimental data clearly converge if one compares the data of the
different laboratories. A special study and review on this subtopic
was given recently by Møller (24) and by Keidel (23).

Kallert's studies, however, led us ahead especially in so far as he
and his coworkers could show that the geniculate level, even in the
cat, seems to be specialized linguistically to a surprisingly high
degree, thus giving this part of the auditory channel decoding
abilities which had previously been attributed only to the cortical
level. This is true not only for the awake state of cats - as
would have been expected - but even for the lightly anesthetized
animal. The main results of this group can be summarized as follows:

1.) There exist neurons at geniculate level which are characterized
    by multiple-peak response histograms. Since the maximum firing
    rates in those neurons are related in a simple numerical
    sequence of the multiple peaks, in just the same way as the
    fundamental and its formants in vowels, and since the
    corresponding range for different neurons is tuned in the same way
    as the corresponding fundamentals in the vowels u-o-a-e and i
    with increasing characteristic frequencies, it seems reasonable to
    regard those special types of neurons as vowel-detectors, as
    was first described by Keidel (20, 22). In the next figure
    (3) a few examples of those vowel-detectors are demonstrated.

Fig. 2: Typical types of temporal patterns obtained in the
        PST-histograms at higher levels of the auditory
        pathway (8, 22).



2.) Another type of single unit at geniculate level did not respond
    to sinusoidal tones at all, in contrast to type (1.) described
    just above. These units fired instead when the stimulus frequency
    was changed continuously and linearly (as well as logarithmically)
    from low to high frequencies and back again. The range of
    frequencies within which those neurons could be activated was
    just of the order which is covered linguistically by the
    consonants of speech. A few examples of that type of cell are
    shown in the next figure (4).







Fig. 3: Vowel-detector units (medial geniculate of the cat; stimulus:
        sinus bursts, frequency in Hz; histogram sections: on,
        sustained, off).

        Left: Frequency response curve for different temporal
        sections of the PST-histograms of the same unit
        (on-duration-off; geniculate).

        Right: Same as left for different units showing different
        accentuation for the different temporal sections (11, 20,
        22).



Fig. 4: Consonant-detector units (medial geniculate of the cat;
        stimulus: frequency-modulated sinus bursts, 0.1-10-0.1 kHz).

        Left: PST-histogram of a single unit at medial geniculate.
        Bar in the middle: duration of a tone the frequency of which
        varies from 100 Hz to 10 kHz and back as indicated below.
        Symmetrical type. The histogram is based upon 50 identical
        runs.

        Right: Same as left for different units (11, 20, 22).






In addition, a second subgroup of those neurons did not respond
symmetrically to increasing and decreasing frequencies, in other
words to a relatively slow frequency modulation of the stimulus, but
revealed a clear-cut asymmetry between rising and falling stimulus
frequencies. Such neurons therefore corresponded to phonemes of the
type represented by transients in speech, which differ in the same
way as i-o and o-i in words like "jot" and "oil", or as i-a and a-i
in words like "yeah" and "like". Keidel (20, 22) therefore suggested
labelling subgroup 2 a "consonant-detectors" and subgroup 2 b
"transient-detectors"; examples of the latter are given in the next
figure (5).
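
One way to make the distinction between the symmetrical and the unsymmetrical type concrete is to compare the spike counts in the rising and falling halves of a PST-histogram. The sketch below is purely illustrative: the function names and the asymmetry index are assumptions introduced here and are not the analysis used by the author.

```python
def pst_histogram(spike_times_per_run, duration, n_bins):
    """Spike counts per time bin, summed over repeated runs (times in s)."""
    counts = [0] * n_bins
    for run in spike_times_per_run:
        for t in run:
            if 0.0 <= t < duration:
                # Clamp guards against rounding at the upper bin edge.
                counts[min(int(t / duration * n_bins), n_bins - 1)] += 1
    return counts

def sweep_asymmetry(counts):
    """(up - down) / (up + down): 0 for a symmetrical response,
    +1 or -1 if the unit fires in one sweep direction only."""
    half = len(counts) // 2
    up = sum(counts[:half])                    # rising-frequency half
    down = sum(counts[len(counts) - half:])    # falling-frequency half
    total = up + down
    return 0.0 if total == 0 else (up - down) / total
```

An index near zero would correspond to the symmetrical type of figure 4, a value near +1 or -1 to the unsymmetrical type of figure 5, under the assumption that the sweep reverses direction at the middle of the histogram.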



[Figure 5: PST-histograms of units M 912 and M 915; axis labels: kHz, ms.]



Fig. 5: Transient-detector units.

As figure 4, but unsymmetrical histograms.

Left: A single unit of this type.

Right: Several different units. In fact there is a wide
spectrum of response types between the two extremes
shown in figures 4 and 5, corresponding to visible
speech patterns of phonemes (11, 20, 22).



Although it could be objected to that interpretation of neuronal
activity at geniculate level that the cat's brain might not be
specialized for phonemes of the human voice, it is well known that
dogs, and probably also cats, clearly react, at least behaviourally,
to the voices of their keepers and racers. Whether those animals can
"understand" what is spoken is, of course, another question. But it
seems very likely that the higher animals among the vertebrates, in
particular the primates and especially man himself, might







have some additional ability to understand and use those
physiological correlates of auditory neuronal activity to separate
the different phonemes in speech, and therefore to use those
mechanisms of activation for the main decoding processes in speech
communication, either by means of additional cortical pattern
recognition systems or even without that help. The latter idea might
be supported by the well-known brain ablation experiments of Neff
(25), Diamond (10) and others, whose results cannot be discussed here
in detail. The main role of the cortical level of the auditory tract
might thus lie much more in the storage of speech information, like
the vocabulary of a foreign language, and in the detection and
memorization of semantic information, than in the basic decoding
processes, which can be performed at geniculate level, as shown
experimentally. Following this trail of working hypotheses, Kallert
and Keidel (15) succeeded in confirming those ideas in experiments
performed on unanesthetized animals with chronically implanted
electrodes (tungsten type), recording telemetrically from the awake
and pain-free cat.

The next figure (6) shows examples of vowel-detector as well as
consonant-detector neurons.

As can clearly be seen, the animal is able to distinguish, already
at geniculate level, between the German words "fein", "dein" and
"mein" on the one hand and between the vowels "a", "e" and "i" on
the other. Those neurons responded neither to sinusoidal tones nor
to simple frequency-modulated auditory stimuli. Another group of
neurons at the same level, however, while unresponsive to all sorts
of linguistically relevant stimuli, was found to be highly
responsive to ecologically important noises from the animal's
surroundings, such as the sound made by quickly moving mice and
similarly complex sounds. The overall situation of a given cat, as
well as the degree of attention which the cat paid to that
situation, clearly influenced the activity of a given neuron after
the electrode had been implanted. Details of those behavioural
experiments can be found in the habilitation thesis of Kallert
(13). Furthermore he found very








[Figure 6: single-unit spike traces from the medial geniculate of the awake cat; stimulus labels: Fein, Dein, Mein.]



Fig. 6: Awake cat, medial geniculate. Upper trace: One single unit,
sensitive to the consonant "f" only. Below for comparison:
another unit without any capability for separation of the
phonemes used. Third trace (unit P2S3): another unit sensitive
only to the consonant "d", same stimuli as in the upper
trace. Bottom trace: Unit P2S4, able to distinguish between
the different vowels, being sensitive only to the vowel "a"
(13, 21).



interesting data on the increased spontaneous activity in the awake
cat compared with the anesthetized one, as well as on the special
influence of inhibitory effects upon the time course of the
PST-histograms at geniculate level, which he reported in detail in
the same paper.

In summary, it could be shown that the geniculate level of the awake
cat is much more capable of performing highly complex decoding
functions, like those necessary for detecting and separating
different phonemes in speech, than was known or expected on the
basis of older and more recent experiments. In particular it was
possible to show that the vowel-, consonant- and speech-transient
detectors found at geniculate level in the anesthetized cat could
clearly be recognized in the awake, freely moving cat as well, using
telemetrical devices for recording the cat's geniculate single-unit







activity by means of tungsten microelectrodes with a tip diameter
of about 1 micrometer. The set-up of this device is shown
schematically in the last figure, together with a schematic drawing
of the type of computerized processing of the single-unit activity
used at our Institute.




Fig.: Block diagram of the electronic set-up and of the information
processing procedure used in our Institute.



References 

1.) Aitkin, L.M. (1973): Medial geniculate body of the cat: responses
to tonal stimuli of neurons in medial division. J. Neurophysiol.
36, 275-283

2.) Aitkin, L.M., Dunlop, C.W. and Webster, W.R. (1966): Click-evoked
response patterns of single units in the medial geniculate body of
the cat. J. Neurophysiol. 29, 109-123

3.) Aitkin, L.M. and Dunlop, C.W. (1968): Interplay of excitation and
inhibition in the cat medial geniculate body. J. Neurophysiol. 31,
44-61

4.) Aitkin, L.M. and Dunlop, C.W. (1969): Inhibition in the medial
geniculate body of the cat. Exp. Brain Res. 7, 68-83

5.) David, E., Finkenzeller, P., Kallert, S. und Keidel, W.D. (1968):
Die Bedeutung der temporalen Hemmung im Bereich der akustischen
Informationsverarbeitung. Pflügers Arch. 298, 322-335















6.) David, E., Finkenzeller, P., Kallert, S. und Keidel, W.D. (1968):
Die mit Mikroelektroden ableitbare Reaktion einzelner Elemente
des colliculus inferior und des corpus geniculatum mediale auf
akustische Reize verschiedener Form und verschiedener Intensität.
Pflügers Arch. 299, 83-93

7.) David, E., Finkenzeller, P., Kallert, S. und Keidel, W.D. (1969):
Reizfrequenzkorrelierte "untersetzte" neuronale
Entladungsperiodizität im colliculus inferior und im corpus
geniculatum mediale. Pflügers Arch. 309, 11-20

8.) David, E., Finkenzeller, P., Kallert, S. und Keidel, W.D. (1969):
Die Antworten einzelner Einheiten in höheren Hörbahnanteilen der
Katze auf kombinierte und auf frequenzmodulierte Töne. Pflügers
Arch. 312, R 130

9.) David, E., Finkenzeller, P., Kallert, S. und Keidel, W.D. (1971):
Beiträge höherer Hörbahnanteile der Katze zur Mustererkennung.
In: Zeichenerkennung durch biologische und technische Systeme.
Hrsg.: Grüsser, O.-J. und Klinke, R.; Springer-Verlag,
Berlin-Heidelberg-New York

10.) Diamond, I.T., Goldberg, J.M. and Neff, W.D. (1962): Tonal
discrimination after ablation of auditory cortex. J. Neurophysiol.
223-235

11.) Kallert, S. (1972): Über die Reizantwort einzelner Zellen im
Corpus geniculatum mediale der Katze bei Untersuchung mit
Mikroelektroden. Dissertation, Erlangen

12.) Kallert, S. (1972): Periodizitäten bei der zentralnervösen
Informationsverarbeitung. In: Mechanismen und Bedeutung schwingender
Systeme. Hrsg.: Rensing, L. und Birukow, G.; Vandenhoeck &
Ruprecht, Göttingen

13.) Kallert, S. (1973): Telemetrische Mikroelektrodenuntersuchungen
am Corpus geniculatum mediale der wachen Katze.
Habilitationsschrift, Erlangen

14.) Kallert, S., David, E., Finkenzeller, P. and Keidel, W.D. (1970):
Two different neuronal discharge periodicities in the acoustical
channel. In: Frequency Analysis and Periodicity Detection in
Hearing. Eds.: Plomp, R. and Smoorenburg, G.F.; A.W. Sijthoff,
Leiden

15.) Kallert, S. and Keidel, W.D. (1973): Telemetrical microelectrode
study of the upper parts of the auditory pathway in free-moving
cats. Pflügers Arch. 343, R 79

16.) Keidel, W.D. (1968): Neuere Ergebnisse und Probleme der Physiologie
des Hörens. Hals-, Nasen- u. Ohrenheilkunde, Heft 20, 9-27

17.) Keidel, W.D. (1969): Informationsphysiologische Aspekte des Hörens.
Studium Generale 22, 49-82

18.) Keidel, W.D. (1970): Neuere Ergebnisse der akustischen
Informationsverarbeitung. In: Ergebnisse der experimentellen Medizin,
Band 3. Hrsg.: Präsidium der Deutschen Gesellschaft für
experimentelle Medizin; VEB Verlag Volk und Gesundheit, Berlin







19.) Keidel, W.D. (1970): Optische und akustische Zeichenerkennung
beim Menschen. Naturwiss. Rundschau 23, 491-498

20.) Keidel, W.D. (1973): Zeitliche und räumliche Aspekte der
menschlichen Zeichenerkennung. In: Nova Acta Leopoldina Nr. 211,
Bd. 38, "Festschrift für Bernd Lueken". Hrsg.: Mothes, K. und
Scharf, J.-H.; Deutsche Akademie der Naturforscher, Leopoldina,
Halle/Saale

21.) Keidel, W.D. (1974): Neuere Ergebnisse der akustischen
Informationsverarbeitung im Zentralnervensystem. In: Kybernetik und
Bionik - Cybernetics and Bionics. Hrsg.: Keidel, W.D., Händler,
W. und Spreng, M.; Oldenbourg Verlag, München (im Druck)

22.) Keidel, W.D. (1974): Human Biocybernetics. In: Advances in
Cybernetics Systems. Ed.: Rose; Gordon & Breach Science Publ.,
London (in press)

23.) Keidel, W.D. and Kallert, S. (1974): Auditory Nervous System.
In: Scientific Basis of Otolaryngology. Eds.: Harrison, D.F.N.
and Hinchcliffe, R.; William Heinemann Medical Books Ltd.,
London (in press)

24.) Møller, A.R. (1972): Coding of sounds in lower levels of the
auditory system. Quarterly Reviews of Biophysics 59-155

25.) Neff, W.D., Casseday, J.H. and Cranford, J.L. (1972): The medial
geniculate body and associated thalamic cell groups: behavioral
studies. Brain, Behav. Evol. 302-310

26.) Webster, W.R. and Aitkin, L.M. (1971): Evoked potential and
single unit studies of neural mechanisms underlying the effects
of repetitive stimulation in the auditory pathway.
Electroenceph. clin. Neurophysiol. 31, 581-592



ADDITIONAL REMARKS 

SCHWARTZKOPFF (addressed to Dr. Kallert, the reader of Dr. Keidel's
paper): It seems to me that Dr. Kallert has described, in part, a very
fundamental neuronal behavior in the processing of auditory
information. Dr. Leppelsack in our laboratory recently finished a
comparable, though less extensive, study in curarized birds. The PST
histograms of telencephalic (ectostriatum) units show essentially the
same transfer functions as found by Dr. Kallert, say of P, PD and D
type. Within one response pattern, on-, off- and tonic segments can be
analyzed; they vary independently, e.g. by being inverted. The
frequency response is possibly of special interest. A unit may show a
complex response area which is subdivided into sectors of different
response pattern (e.g. P/D type).







DYNAMIC PROPERTIES OF COCHLEAR NUCLEUS UNITS IN RESPONSE TO EXCITATORY 
AND INHIBITORY TONES 

A.R. MØLLER

Division of Physiological Acoustics, Department of Physiology II,
Karolinska Institutet, 104 01 Stockholm 60, Sweden

Introduction 

It is well known that most single units in the cochlear
nucleus, just as in other parts of the ascending auditory pathway,
exhibit excitatory as well as inhibitory response areas. Usually the
excitatory areas are surrounded by inhibitory areas, the one above the
excitatory area usually being the most pronounced. Thus activity
evoked by a tone the intensity and frequency of which are within the
excitatory areas (e.g. a tone at CF with an intensity above threshold)
can be inhibited by a second tone the frequency and intensity of
which are within the inhibitory areas of the unit.

Most natural sounds have their energy distributed over a wide 
range of frequencies and are therefore likely to activate both 
inhibitory and excitatory areas of many units simultaneously. 

The functional importance of the arrangement of inhibitory and
excitatory response areas has received little study, but it is
generally assumed that the inhibitory areas sharpen the excitatory
response in such a way that a higher spectral resolution is achieved.
This hypothesis is derived from results obtained with steady-state
tones; the extent of the excitatory and inhibitory response areas has
been described only by threshold curves. Recent neurophysiological
results from the cochlear nucleus and superior olive do not lend
support to






Møller: DYNAMIC PROPERTIES OF COCHLEAR NUCLEUS UNITS

this hypothesis². Nonetheless, it is conceivable that this
characteristic arrangement of inhibitory and excitatory areas may be
of importance for the coding of complex sounds the frequency and
amplitude of which vary more or less rapidly. It is likely that this
arrangement of areas in cochlear nucleus units would enhance rapid
changes in frequency if the inhibitory areas acted more slowly than
the excitatory areas. In order to study the role of the inhibitory
areas, it is necessary to know the dynamic properties of the
inhibition and to compare these with the dynamic properties of
excitatory stimuli. No such data are available.

It has been shown earlier that sinusoidal modulation of the
amplitude of a tone in a certain range of modulation frequencies gives
rise to a comparatively large modulation of the discharge rate over a
large range of sound intensities in single units in the cochlear
nucleus¹,²,³. In previous studies it has furthermore been shown that
linear system analysis can be applied to estimate the frequency
response functions of cochlear nucleus units in response to
amplitude-modulated tones as well as to noise- and frequency-modulated
tones. In these studies the tones and noise were modulated
sinusoidally¹ or with pseudorandom noise³. The dynamic properties were
described by means of the dynamic transfer function from the
modulation of the sound to the modulation of the neural discharge
frequency. Such transfer functions express the ratio and the phase
shift between (1) the modulation of the cycle histograms of the
responses and (2) the relative modulation of the sound stimuli, as a
function of modulation frequency. In many units the transfer function
has a peak at a certain modulation frequency, usually located between
50 and 300 Hz.
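
The ratio and phase shift described above can be sketched in a few lines of Python. This is a minimal illustration, assuming that the stimulus envelope varies as 1 + m sin(2π f_mod t) and that the response modulation is taken from the fundamental Fourier component of the cycle histogram. The function names and the number of bins are hypothetical and are not taken from the studies cited.

```python
import math

def cycle_histogram(spike_times, f_mod, n_bins=32):
    """Spike counts as a function of phase within the modulation cycle."""
    counts = [0] * n_bins
    for t in spike_times:
        frac = (t * f_mod) % 1.0          # position within the cycle, 0..1
        counts[min(int(frac * n_bins), n_bins - 1)] += 1
    return counts

def modulation_gain_and_phase(counts, m_stimulus):
    """Gain (dB) and phase (degrees) of the histogram modulation
    relative to a stimulus envelope 1 + m_stimulus * sin(theta).
    Assumes a non-empty histogram with non-zero modulation."""
    n = len(counts)
    mean = sum(counts) / n
    angles = [2.0 * math.pi * (k + 0.5) / n for k in range(n)]  # bin centres
    a = 2.0 / n * sum(c * math.cos(th) for c, th in zip(counts, angles))
    b = 2.0 / n * sum(c * math.sin(th) for c, th in zip(counts, angles))
    m_response = math.hypot(a, b) / mean   # relative modulation of the response
    gain_db = 20.0 * math.log10(m_response / m_stimulus)
    phase_deg = math.degrees(math.atan2(a, b))
    return gain_db, phase_deg
```

Repeating the calculation for a series of modulation frequencies would trace out gain and phase curves of the kind the text calls a transfer function.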





Fig. 1 Cycle histograms of the responses to sinusoidally
amplitude-modulated tones, locked to the modulation. Two tones,
one inhibitory (5550 Hz) and one excitatory (4500 Hz), were
presented simultaneously. A: The excitatory tone was modulated.
B: The inhibitory tone was modulated. The intensity of the
excitatory tone was 55 dB SPL and that of the inhibitory tone
65 dB SPL. The unit's threshold for a single tone at CF was
30 dB SPL.

In the present study two tones were presented simultaneously, one
inhibitory and one excitatory. The frequency of the excitatory tone was
equal to the unit's CF and that of the inhibitory tone was equal to the
best inhibitory frequency (BIF). The responses obtained when either one
of the tones was modulated sinusoidally were studied.



Results 

Figure 1 shows cycle histograms of the responses to two tones.
In A the excitatory tone (4500 Hz) was modulated and in B the
inhibitory tone (5550 Hz) was modulated. The modulation was 30% and
the excitatory tone had an intensity of 55 dB SPL, the inhibitory tone
65 dB SPL. It is seen that the modulation of the histograms is
somewhat smaller when the inhibitory tone is modulated, and that it is
reversed in phase relative to that obtained with modulation of the
excitatory tone. It is also seen that the shape of the histograms is
almost sinusoidal. Figure 2 A shows the ratio between the relative
amplitude of the modulation of the histograms and the modulation of
the tones (in dB) when the excitatory tone was modulated (filled
circles and solid lines) and when the inhibitory tone was modulated
(filled circles and dashed lines) for modulation frequencies from 10
to 1500 Hz. Figure 2 B shows the phase angle between the




[Figure 2: ordinate of panel B: phase angle in degrees; abscissa: frequency in Hz, 1 to 2000 Hz on a logarithmic scale.]

Fig. 2 A: Gain function showing the ratio between the modulation of
the histograms in Fig. 1 and the modulation of the tones. The
solid lines represent modulation of the excitatory tone and
dashed lines that of the inhibitory tone. B: Phase angle
between the modulation of the sound and that of the histogram.
The curve representing modulation of the inhibitory tone
(dashed line) was shifted 180 degrees to facilitate comparison.

modulation of the tones (envelope) and the modulation of the histograms
when the excitatory tone was modulated (solid lines), together with
that when the inhibitory tone was modulated (dashed lines). The latter
curve was shifted by 180° to facilitate comparison. It is seen that
the frequency transfer functions for modulation of the respective
tones are very similar, except that modulation of the inhibitory
tone gives about 6-8 dB less modulation of the histograms than
modulation of the excitatory tone. In A it is also seen that the
relative gain of the second harmonic of the histograms (dashed and
solid lines without symbols) has about the same value for inhibition
and excitation, indicating that the major nonlinearity of the
system is shared by the inhibitory and the excitatory input to the
neurons.

This type of response pattern is typical of most units in the
cochlear nucleus. It can thus be concluded that the dynamic properties
of the inhibitory areas are very similar to the dynamic properties of
the excitatory areas. The inhibition is therefore not likely to be
mediated by interneurons, since that would introduce an additional
delay relative to the excitatory input, leading to different phase
characteristics for inhibition and excitation. Such




[Figure 3: cycle histograms; panels labelled with the frequency of the modulated tone in Hz.]




Fig. 3 Cycle histograms of the responses to amplitude-modulated
tones. Two tones were presented simultaneously: one unmodulated
at CF (4150 Hz) and one modulated, the frequency of which was
varied (indicated by legend numbers). The sound pressure level of
the tone at CF was 55 dB SPL. The variable tone had an intensity
of 65 dB SPL. The threshold of the unit at CF was about 35 dB SPL.



was not the case. 
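
The delay argument can be made quantitative: a pure time delay adds a phase lag of 360 degrees multiplied by the modulation frequency and the delay. The sketch below evaluates this relation. The delay of 1 ms used in the usage note is an assumed, illustrative value and is not a figure from the present study; the function name is hypothetical.

```python
def delay_phase_lag_deg(f_mod_hz, delay_s):
    """Phase lag in degrees that a pure time delay adds
    at a given modulation frequency."""
    return 360.0 * f_mod_hz * delay_s

# Lag for an assumed (hypothetical) extra delay of 1 ms
# at several modulation frequencies in Hz.
lags = {f: delay_phase_lag_deg(f, 0.001) for f in (10, 100, 500, 1000)}
```

With the assumed delay of 1 ms the lag would be 36 degrees at 100 Hz and 180 degrees at 500 Hz, so an interneuron in the inhibitory path would be expected to separate the two phase curves increasingly towards high modulation frequencies.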

Figure 3 shows histograms of the responses to two tones when the 
frequency of the modulated tone was varied from above the upper limit 
of the inhibitory area towards lower frequencies. The tone at CF was 







unmodulated. 



[Figure 4: polar plot.]




Fig. 4 Amplitude and phase of the modulation of the histograms in
Fig. 3 in polar plot.

The amplitude of the modulation of the histograms is shown in a
polar plot in Fig. 4, with the frequency of the modulated tone
indicated by legend numbers. Responses to two different sound pressure
levels (55 and 65 dB SPL) are shown. A gradual shift of phase angle is



[Figure 5: abscissa: sound intensity of the tone at CF in dB SPL.]

Fig. 5 Relative gain of a typical unit in response to tones modulated
with pseudorandom noise. Solid lines show the response to a single
modulated tone at CF (18.7 kHz); dashed lines and squares: an
unmodulated inhibitory tone (21.135 kHz) was added to the
modulated tone; dashed lines and triangles: the inhibitory
tone was modulated while the tone at CF was unmodulated. The sound
intensity of the tone at CF is given as abscissa. The intensity
of the inhibitory tone was 5 dB higher. Each data point is based
on 5 min of recording. The sounds were presented for 10 sec,
followed by a 10 sec pause.

revealed when the frequency of this tone is varied. 

The amplitude modulation of a sound above a certain frequency
was coded in the discharge pattern of a unit over a large range of
sound intensities¹,². That range was larger than that over which the
discharge frequency was a function of the stimulus level. Usually the
discharge frequency reaches a plateau about 20-30 dB above the
threshold of these units. In the range of modulation frequencies
between 50 and 300 Hz the gain had high and relatively constant values
over a sound intensity range of more than 40 dB.

Addition of an inhibitory tone to a modulated tone at CF 
increases the dynamic range with regard to modulation of most units 
and decreases the discharge rate. That is illustrated in Fig. 5 A 
which shows the maximal gain for modulation in the frequency range 
from 10-1500 Hz as a function of sound intensity of the excitatory 
tone in three different situations: a) one single tone at CF was 
modulated (filled circles, solid lines), b) an inhibitory tone at BIF 
(best inhibitory frequency) was added to the modulated tone at CF 
(dashed lines and filled circles), c) the inhibitory tone was 





modulated when the tone at CF was unmodulated (dashed lines and filled
triangles). The intensity of the inhibitory tone was 5 dB higher than
that of the excitatory tone. (The CF of the unit was 18.7 kHz and the
BIF was 21.135 kHz.)

The data shown in Fig. 5 were obtained using pseudorandom noise
modulation³, and the relative modulation of the tones was -15 dB
relative to 100% RMS modulation. Recordings were made for 5 min for
each of the three different situations and at each intensity. The
tones were presented for 5 sec, followed by a 5 sec pause. It is seen
that addition of an unmodulated inhibitory tone increases the relative
gain slightly at low intensities and considerably at high sound
intensities, in such a way that the gain is about 5 dB or more over a
45 dB range, whereas for a single tone at CF the gain decreases from
8 to -4 dB in the same range. In this unit the modulation of the
inhibitory tone was almost as efficient as the modulation of the
excitatory tone. In many units modulation of the inhibitory tone gives
considerably less modulation of the discharge frequency. Figure 5 B
shows the discharge frequency during the experiment illustrated in
Fig. 5 A as a function of sound intensity. It is seen that the
discharge frequency varies over a much larger range of sound
intensities when two tones are presented simultaneously than with a
single tone at CF.
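
For readers who prefer linear ratios, the decibel values quoted above can be converted as sketched below. The sketch assumes the usual amplitude convention of 20 log10, which the text does not state explicitly; the function name is hypothetical.

```python
def db_to_ratio(level_db):
    """Convert a level in dB to a linear amplitude ratio (20 log10 convention)."""
    return 10.0 ** (level_db / 20.0)

# Relative modulation of the tones: -15 dB re 100 % RMS modulation.
rms_modulation_percent = 100.0 * db_to_ratio(-15.0)
```

Under this assumption, -15 dB relative to 100% RMS modulation corresponds to about 17.8% RMS modulation, and gains of 8 dB and -4 dB correspond to ratios of about 2.5 and 0.63.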

The results of the present study can thus be summarized:

1. When two tones are presented simultaneously, one at CF and one at
the upper BIF, amplitude modulation of either of the two results in
transfer functions of almost identical shape but the gain for





modulation of the inhibitory tone is usually somewhat lower than that 
of the excitatory. 

2. The modulation of the histograms of the responses to modulation of
the inhibitory tone is shifted almost exactly 180° compared with
modulation of the excitatory tone.

3. There is a characteristic change in phase of the modulation of the
discharge frequency when an unmodulated tone at CF is presented
together with a modulated tone the frequency of which is varied from
above the best inhibitory frequency through and below the
characteristic frequency of the unit, whereas the magnitude of the
modulation in many units is almost independent of the frequency of
the carrier.

4. The sound intensity range over which amplitude modulation of a
tone produces a modulation of the discharge frequency is usually
considerably extended when an unmodulated tone at the best inhibitory
frequency is added. In many units the relative modulation gain in the
range of modulation frequencies of 100-500 Hz is almost constant over
a range of sound intensities of 40 dB or more.

Discussion 

The results of the present study show that inhibitory and
excitatory inputs to neurons in the cochlear nucleus have practically
identical temporal integration and latency. This indicates that both
types of input are transmitted to these neurons from the hair cells
in the cochlea through identical pathways. Furthermore, inhibitory
and excitatory synapses on the neuron from which the recording





was made seemed to have identical temporal characteristics with regard 
to changes in sound intensity. 

The phase angle of the modulation of the discharge rate changes
gradually from the frequency at which the second tone was inhibitory
to the frequency at which it was excitatory. The implication may be
that modulation of different components of a complex sound is coded
differently in the discharge rate of these units. The phase angle
between the modulation of the discharge rate and the envelope of the
sound is a function of the frequency of the modulated component of the
sound in relation to the characteristic frequency of the unit. The
responses of two neighboring units with a slight difference in CF to
time-varying sounds can thus be assumed to have discharge patterns
the modulations of which have a certain phase relationship.

When interpreting the results of such studies it should be kept
in mind that excitation and inhibition probably overlap in frequency
above threshold and therefore cannot be separated. An excitatory tone
at CF will accordingly also activate the inhibitory input to these
units and vice versa. The discharge pattern of such neurons is thus
most likely to be a result of both inhibition and excitation even when
the stimulation is a pure tone at CF, at least for sound intensities
well above threshold. A unit that has nonmonotonic stimulus response
curves for stimulation with pure tones at CF may hence have an
inhibition that increases more rapidly with increase in sound
intensity than does the excitation above a certain sound level.




Acknowledgements

This work was supported by the Swedish Medical Research Council
(Grants 04X-90 and 04P-3251).

¹ A.R. Møller, "Coding of amplitude and frequency modulated sounds in
the cochlear nucleus of the rat," Acta physiol. scand. 86,
223-238 (1972 a).

² A.R. Møller, "Coding in lower levels of the auditory system,"
Quart. Rev. Biophys. 1, 59-155 (1972 b).

³ A.R. Møller, "Statistical evaluation of the dynamic properties of
cochlear nucleus units using stimuli modulated with pseudorandom
noise," Brain Res. 57, 443-456 (1973).







ROUGHNESS AND ITS RELATION TO THE TIME-PATTERN OF
PSYCHOACOUSTICAL EXCITATION

A. VOGEL

Institut für Elektroakustik der Technischen Universität,
München, FRG

Introduction 

The sensation of roughness is produced by sounds with a strong
time structure, such as amplitude-modulated sounds. For single
sinusoidally amplitude-modulated tones, the dependence of
roughness on the important physical parameters (degree of
modulation, modulation frequency, carrier frequency, SPL) was
systematically studied by TERHARDT (1968 a, b; 1974), but there are
only a few investigations of the roughness of more complex
sounds.

Therefore the following experiments on roughness were carried
out to study how the overall sensation of roughness is composed
of partial contributions. The results will be discussed by means of
the model of the psychoacoustical excitation pattern (ZWICKER and
FELDTKELLER 1967).

1. Method

The method of constant stimuli was used. Five subjects had to
compare the roughness of the test sound with the roughness of a
1 kHz comparison tone, sinusoidally modulated in amplitude with a
variable degree of modulation. The presentation was monaural, using
earphones (Beyer DT-48). In the diagrams the medians and the
interquartile ranges are plotted as results. The SPL is given by
relating the sound pressure $p$ to $p_0 = 2\cdot10^{-5}\,\mathrm{N/m^2}$.
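
The relation between sound pressure and SPL can be written out as a short sketch, assuming the usual definition SPL = 20 log10(p / p0) with p taken as the RMS sound pressure. The reference pressure is the one given in the text; the function names are hypothetical.

```python
import math

P0 = 2e-5  # reference sound pressure in N/m^2, as given in the text

def spl_db(p_rms):
    """Sound pressure level in dB for an RMS sound pressure in N/m^2."""
    return 20.0 * math.log10(p_rms / P0)

def pressure_from_spl(spl):
    """RMS sound pressure in N/m^2 corresponding to a level in dB SPL."""
    return P0 * 10.0 ** (spl / 20.0)
```

Under this definition the 80 dB SPL used for the test sounds corresponds to an RMS sound pressure of 0.2 N/m^2.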



2. Measurements

2.1 First the dependence of the roughness of an amplitude-modulated
tone on the carrier frequency was determined for several modulation
frequencies $f_{mod}$. The degree of modulation of the test sound was
$m_T = 0.8$ (SPL = 80 dB). Fig. 1 shows the results; the degree of
modulation of the 1 kHz comparison tone for equal roughness is
plotted as a function of the carrier frequency $f_T$







Vogel: ROUGHNESS 




Fig. 1: Degree of modulation of a 1 kHz comparison tone which
produces the same roughness as a modulated tone ($m_T = 0.8$) with
the carrier frequency $f_T$.

of the test sound. In this diagram, 3 important sections can be
distinguished.

For $f_T < 1\,\mathrm{kHz}$ the roughness decreases with increasing
$f_{mod}$. Thus the frequency selectivity of the ear, also found by
TERHARDT, is confirmed; as a consequence of a kind of filter system,
the interference between the components of the spectrum decreases
with increasing distance between these components. For
$1\,\mathrm{kHz} < f_T < 4\,\mathrm{kHz}$ nearly equal roughness was
found for all $f_T$ and all $f_{mod}$; this is in good accordance with
former results as well.

For $f_T > 4\,\mathrm{kHz}$ the roughness clearly decreases with
increasing carrier frequency, independent of $f_{mod}$.

In another experiment this effect proved to be dependent on
level. Fig. 2 shows the result for $f_{mod} = 50\,\mathrm{Hz}$ and
$f_T > 1\,\mathrm{kHz}$ using SPLs of 60 dB and 90 dB. The results for
SPL = 80 dB are taken from Fig. 1 for comparison.

As can be seen in the diagram, the higher the level, the more
the sensation of roughness decreases with increasing frequency
$f_T$. Towards high frequencies the upper limit of the range of
audibility seems to influence the sensation of roughness. If








Fig. 2: Degree of modulation of a 1 kHz comparison tone which
produces the same roughness as a modulated tone ($m_T = 0.8$) with
the carrier frequency $f_T$, for various SPLs.



this assumption is true, an additional tone producing partial 
masking should produce similar effects on roughness. 



2.2 Therefore, in addition to an amplitude-modulated tone
($f_1 = 1\,\mathrm{kHz}$; $m_T = 0.8$), an unmodulated partial masking
tone with variable frequency $f_2$ was presented. The roughness of this
sound was compared with the roughness of the unmasked 1 kHz tone,
modulated with a variable degree of modulation and the same
$f_{mod} = 50\,\mathrm{Hz}$ as the test sound. Fig. 3 shows the result
of this experiment. The degree of modulation of the comparison tone
for equal roughness, related to the degree of modulation $m_T$ of the
partially masked 1 kHz tone, is plotted as a function of the frequency
distance $\Delta f = |f_2 - f_1|$, expressed as a distance in critical
bands $\Delta z = |z_2 - z_1|$.
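
To give a feeling for the critical-band distance used as abscissa, the sketch below converts frequencies to critical-band rate and takes the difference. The conversion formula is an assumption made here: it is a widely used analytical approximation published later (Zwicker and Terhardt, 1980) and is not necessarily the conversion used by the author, who presumably worked from the tabulated critical-band scale. The function names are hypothetical.

```python
import math

def hz_to_bark(f_hz):
    """Approximate critical-band rate z (Bark) for a frequency in Hz.
    Assumed formula: z = 13 atan(0.76 f/kHz) + 3.5 atan((f / 7.5 kHz)^2)."""
    f_khz = f_hz / 1000.0
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan((f_khz / 7.5) ** 2)

def critical_band_distance(f1_hz, f2_hz):
    """Distance |z2 - z1| in critical bands between two tones."""
    return abs(hz_to_bark(f2_hz) - hz_to_bark(f1_hz))
```

With this approximation the 1 kHz test tone lies at about 8.5 Bark, so a masking tone at a given distance in critical bands can be located by evaluating `critical_band_distance(1000.0, f2)` for trial values of the masker frequency.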

In the case of f_2 > f_1 and SPL = 80 dB, the partial-masking tone has a marked influence on the roughness of the test sound (open circles). The decrease of roughness with decreasing Δz can be characterized as a kind of throttling (partial masking) of roughness, similar to the throttling (partial masking) of loudness. For SPL = 60 dB this influence is significantly smaller (open squares), as it is for f_2 < f_1 at SPL = 80 dB (closed circles), as is to be expected.



Fig. 3: Degree of modulation m of the 1 kHz tone, related to the degree of modulation m_T of a partially masked 1 kHz tone, for equal roughness as a function of the distance Δz = |z_2 - z_1| of an unmodulated partial-masking tone (f_2).

2.3 Similar to a former experiment (TERHARDT 1974), the roughness of 2 sinusoidally modulated tones was determined. The frequency of the lower tone was f_1 = 500 Hz; the frequency of the higher tone f_2 was varied (f_mod = 50 Hz). Each of the tones was modulated with m_T = 0,63 and presented at SPL = 70 dB. As a result, the degree of modulation m of the 1 kHz tone which is as rough as the modulated sound is plotted in Fig. 4 as a function of the frequency f_2. The parameter is the phase configuration of the modulation in the two-AM-tone signal. In one case (open circles) the 2 tones were modulated co-phasically; in the other case (closed circles) they were modulated anti-phasically, so that one AM tone has its maximum at the same time as the other has its minimum.



Fig. 4: Degree of modulation m of the 1 kHz tone which produces the same roughness as 2 modulated tones (m_T = 0,63) as a function of the frequency f_2 of the higher tone. Open circles: co-phasic, closed circles: anti-phasic modulation. (Abscissa: f_2 from 0,5 to 8 kHz.)

In the case of anti-phasic modulation, the roughness of the test sound is nearly equal to the roughness of the 1 kHz tone modulated with the same degree of modulation, independent of f_2. If the 2 tones are modulated co-phasically, the degree of modulation m had to be increased, compared with the anti-phasic modulation, by a factor of 1,25 for 0,8 kHz < f_2 < 2 kHz.

Applying the square law

$r \sim m^{2}$   (1)

between the roughness r and the degree of modulation m (TERHARDT 1968b), this increase expressed in roughness reaches a factor of 1,6. For f_2 > 2 kHz this factor diminishes, because the roughness of the modulated tone with the frequency f_2 decreases (Fig. 1).
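The step from a factor of 1,25 in degree of modulation to a factor of about 1,6 in roughness follows from the square law (1). A minimal Python sketch, assuming nothing beyond that proportionality (the function name `roughness_ratio` is introduced here for illustration and does not come from the paper):

```python
def roughness_ratio(m_a: float, m_b: float) -> float:
    """Ratio r_a / r_b of two roughnesses under the square law r ~ m**2."""
    if m_b <= 0:
        raise ValueError("m_b must be positive")
    return (m_a / m_b) ** 2


# Co-phasic versus anti-phasic modulation: m had to be raised by a factor of 1.25.
print(round(roughness_ratio(1.25, 1.0), 2))
```

The same function can be applied to any pair of degrees of modulation that were matched for equal roughness.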

For a frequency f_2 = 1 kHz, the influence of a sound pressure level difference between the 2 modulated tones on the roughness of this sound was measured. The result is shown in Fig. 5.




Fig. 5: Degree of modulation m of the 1 kHz tone (SPL = 70 dB) which produces the same roughness as 2 tones, modulated with m_T = 0,8 (f_1 = 500 Hz; f_2 = 1 kHz), as a function of SPL_1 (SPL_2 = 70 dB). Open circles: co-phasic, closed circles: anti-phasic modulation. (Abscissa: SPL_1 from 30 to 70 dB.)

The degree of modulation m of the 1 kHz tone for equal roughness is plotted as a function of SPL_1 of the lower tone (f_1 = 500 Hz; SPL_2 = 70 dB; m_T = 0,8).

For anti-phasic modulation (closed circles), the roughness remained constant, while for co-phasic modulation (open circles) the roughness decreases with decreasing SPL_1, whereby a level difference of 10 dB has no significant influence. For SPL_1 = 30 dB and 40 dB the roughness of the sound is nearly equal to the roughness of a single modulated tone.



3. Discussion 

The fact that the roughness of a single amplitude-modulated tone decreases if it is partially masked indicates that the fluctuation of the whole psychoacoustical excitation pattern is analyzed for the sensation of roughness. Fig. 6 shows the EXCITATION LEVEL - CRITICAL BAND RATE - PATTERN for sinusoidal tones of various frequencies and various SPLs, which is deduced from the masked threshold of a narrow band noise (simplified diagram). For amplitude modulation it can be assumed that this excitation level fluctuates, according to the degree of modulation, by approximately the same amount at each place, neglecting the nonlinearity of the upper slope (in Fig. 6 the hatched range belongs to m = 0,5 as an example). If the upper slope is masked by an unmodulated tone, the range of fluctuation diminishes and with it the roughness. This hypothesis can also explain why the roughness of a modulated tone of high level decreases with increasing carrier frequency: the upper slope of the excitation pattern is "masked" by the end of the audible frequency range.

Fig. 6: Excitation level - critical band rate - pattern of narrow band sounds of various frequencies and various SPLs. The hatched range characterizes the fluctuation range of a modulated tone (m = 0,5).

Just as loudness is composed of specific loudnesses (ZWICKER u. FELDTKELLER 1967), the roughness r seems to be composed of specific roughnesses r*. The relative temporal fluctuations of the excitation at a given point z are relevant for the specific roughness r* at that point, nearly independent of the absolute excitation level there. The experiments with two amplitude-modulated tones show that the total roughness is the sum of the specific roughnesses, taking the phase of the modulation into account; it therefore seems reasonable to apply the following rule

$r(t) \sim \int r^{*}(z,t)\,dz$   (2)

where r(t) is the total roughness depending on time t, and 
r*(z,t) is the specific roughness depending on critical band 
rate z and on time t. 
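Rule (2) can be illustrated numerically. The sketch below assumes a hypothetical, time-independent specific roughness that is constant wherever the excitation pattern fluctuates and zero elsewhere, a critical band rate scale running from 0 to 24 Bark, and a proportionality constant of 1; these choices and the function names are illustrative assumptions, not values taken from the paper.

```python
def total_roughness(specific_roughness, z_lo=0.0, z_hi=24.0, steps=2400):
    """Approximate the integral of r*(z) dz with the midpoint rule.

    specific_roughness: function mapping critical band rate z (Bark) to r*(z).
    The proportionality constant in rule (2) is taken as 1 here (assumption).
    """
    dz = (z_hi - z_lo) / steps
    return sum(specific_roughness(z_lo + (i + 0.5) * dz) for i in range(steps)) * dz


def flat_pattern(z_start, z_end, value=1.0):
    """Hypothetical r*(z): constant inside [z_start, z_end), zero outside."""
    return lambda z: value if z_start <= z < z_end else 0.0


half_range = total_roughness(flat_pattern(8.0, 20.0))  # excites 12 of 24 Bark
full_range = total_roughness(flat_pattern(0.0, 24.0))  # excites all 24 Bark
print(full_range / half_range)
```

Under these assumptions a sound that excites twice the range of critical band rate comes out twice as rough, because the integral grows with the excited range.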

As a consequence of this hypothesis (the whole EXCITATION - CRITICAL BAND RATE - TIME - PATTERN contributes to the roughness), an amplitude-modulated sound exciting the whole range of the critical band rate equally could be at best twice as rough as an amplitude-modulated 1 kHz tone of SPL = 80 dB, which excites only half of this range (Fig. 6). This was checked by the following experiment:

20 sinusoidal tones, each with SPL = 60 dB, were spread over the range of audibility in such a way that each critical band (2 - 21 Bark) contains one tone. The tones were not phase-locked. This sound was amplitude-modulated with f_mod = 50 Hz and m_T = 0,5. The subjects had to compare the roughness of this sound with the roughness of an amplitude-modulated 1 kHz tone of SPL = 80 dB, equal f_mod and variable degree of modulation m.

As a result, m = 0,8 was found for equal roughness. That means, using equation (1), that the modulated sound is rougher by a factor of 2,5 than the 1 kHz tone modulated with an equal degree of modulation, since (0,8/0,5)^2 is about 2,5. This result is in good accordance with the postulate deduced from the hypothesis that the sensation of roughness is produced by the complete EXCITATION - CRITICAL BAND RATE - TIME - PATTERN.



Appendix 

In Fig. 3, the influence of an unmodulated partially masking tone on the roughness of an amplitude-modulated 1 kHz tone is shown. For SPL = 80 dB this influence is clear. For SPL = 60 dB, however, the excited range of critical band rate is so narrow that the decrease of roughness could only be measured if the frequency distance is less than 3 Bark. However, it does not seem reasonable to reduce the distance below Δz = 3 Bark, because interference between the masking tone and the modulated 1 kHz tone may influence the results.

To avoid this effect, another experiment was carried out with subjects having a strong hearing loss. Fig. 7 shows the threshold in quiet of two subjects with a "damage" at about 4 kHz. Instead of masking the roughness by an additional tone, the modulated tone was situated just below this "installed low-pass". Thereby "masking" of the upper slope of the excitation pattern is obtained without additional interference. Besides, the attention of the subject was not disturbed by the masker. The roughness of the partially masked tone (m_T = 0,8) was compared with the roughness of a 1 kHz tone modulated with m. The SPL of the 1 kHz tone was adjusted for equal loudness, as marked by the dashed and dotted line in Fig. 7. The definition of the distance Δz is marked in this figure, too. The results of this experiment for SPL = 60 dB and a modulation frequency f_mod = 50 Hz are shown in Fig. 8a, b. The expected decrease of roughness with decreasing Δz can be seen for Δz = 3 Bark.




Fig. 7: Threshold in quiet of two subjects with a strong hearing loss at about 4 kHz.

For comparison, this experiment was also carried out with a high-pass noise (HPN) with a cut-off frequency of 2 kHz as masker. Thus, the individual threshold had no influence on the result (Fig. 8c). Corresponding to the case f_2 < f_1 in the experiment described in section 2.2 (Fig. 3), a low-pass noise (LPN) with a cut-off frequency of 500 Hz was also used as masker. The results in Fig. 8d therefore show the decrease of roughness if the lower slope of the excitation pattern is masked.

Fig. 8: Degree of modulation m (related to m_T) of the unmasked tone which produces the same roughness as a partially masked tone modulated with m_T = 0,8, as a function of the distance Δz of the masker (SPL = 60 dB, f_mod = 50 Hz). The masker was: "threshold in quiet" (8a, b), high-pass noise with a cut-off frequency of 2 kHz (8c) and low-pass noise with a cut-off frequency of 500 Hz (8d).

In these two additional experiments (Fig. 8c, d), the unmasked tone modulated with m had the same carrier and modulation frequency (f_mod = 50 Hz) as the partially masked tone (m_T = 0,8; SPL = 60 dB). The SPL of the noise was chosen in such a way that the level corresponding to one critical band was 60 dB.

All the results shown in Fig. 8 confirm the hypothesis that the entire psychoacoustical excitation pattern contributes to roughness. The nonlinearity of the upper slope is clearly reflected in the different masking effects at different SPLs. Supposing that a dependence of roughness on SPL is produced only by the dependence of the number of excited critical bands on SPL, the order of this dependence can be predicted from Fig. 6. As a consequence of an SPL increment of 20 dB, the roughness should increase by nearly a factor of 2. This, too, is in very good accordance with the results known from psychoacoustical experiments.



Acknowledgement 

This work was carried out within the Sonderforschungsbereich Kybernetik, München, supported by the Deutsche Forschungsgemeinschaft.

References 

Terhardt, E. (1968a). "Über die durch amplitudenmodulierte Sinustöne hervorgerufene Hörempfindung," Acustica 20, 210-214.

Terhardt, E. (1968b). "Über akustische Rauhigkeit und Schwankungsstärke," Acustica 20, 215-224.

Terhardt, E. (1974). "On the Perception of Periodic Sound Fluctuations (Roughness)," Acustica 30, in press.

Zwicker, E., and Feldtkeller, R. (1967). Das Ohr als Nachrichtenempfänger (S. Hirzel Verlag, Stuttgart).







TRANSIENT MASKING PATTERN OF NARROW BAND MASKERS 

H. FASTL 

Institut für Elektroakustik der TU München, FRG

1. INTRODUCTION

Transient masking patterns of narrow band maskers can be considered a measure of the temporal as well as the spectral resolving power of the ear. The well-known effects of forward and backward masking represent the limitations of temporal resolution, whereas the masking pattern as a function of test-tone frequency characterizes the spectral resolution. Thus, backward, simultaneous and forward masking patterns of a critical-band and a sinusoidal masker, respectively, were determined by the conventional threshold method. In particular, the influence of masker bandwidth on the transient masking pattern will be discussed. In addition, pulsation threshold patterns of both sinusoidal and critical-band "masker" are compared.

2. METHOD AND PROCEDURE 

The threshold of short tone impulses, masked by a narrow band noise (Δf = 1800 Hz) and a sinusoid at 8.5 kHz, respectively, was measured with 9 observers with "normal" hearing. A slightly modified method of Békésy tracking was used, i.e. masker impulses with and without test-tone impulse alternated. The masker impulses were separated by pauses of at least 500 ms; the presentation was monaural through earphones (BEYER DT48S) with a free-field correction network (Zwicker und Feldtkeller 1967). Test-tone and masker impulses had Gaussian rise and fall; undesired modulation products were removed by 1/3-octave-band filters.

3. RESULTS 

In all figures, the sound pressure level of the test-tone impulses is given as a function of their frequency f_T or critical-band rate z. L_T represents the level of the continuous tone (re 2·10^-5 N/m^2) out of which the test impulse with the duration T_T was cut.

The transient masking pattern of a critical-band masker at 21.5 Bark is represented in Fig. 1. The sound pressure level of the masker was L_M = 70 dB, its duration T_M = 500 ms, and the rise time of the Gaussian gating signal was 0.5 ms. The test-tone impulses had a duration T_T = 1 ms with a rise time of 0.5 ms; their sound pressure level is given as a function of the critical-band rate z as well as the time t. Negative values of t correspond to backward masking, 0 < t < 500 ms indicates simultaneous masking, and t > 500 ms refers to forward masking. The medians of more than 1000 single results were connected by the smooth curves plotted in Fig. 1.
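The three time regions of the transient masking pattern can be stated compactly. The following Python sketch is only an illustration of the convention described above; the function name is hypothetical, and assigning the boundary values t = 0 and t = 500 ms to simultaneous masking is an assumption, since the text leaves them open.

```python
def masking_condition(t_ms: float, masker_duration_ms: float = 500.0) -> str:
    """Classify a test-tone time t (ms, relative to masker onset).

    Boundary values (t == 0 and t == masker duration) are counted as
    simultaneous masking here; the paper does not specify them.
    """
    if t_ms < 0:
        return "backward"
    if t_ms <= masker_duration_ms:
        return "simultaneous"
    return "forward"


for t in (-10.0, 250.0, 520.0):
    print(t, masking_condition(t))
```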




Fig. 1: Transient masking pattern of a critical-band masker impulse at 21.5 Bark. L_M = 70 dB; T_M = 500 ms; T_T = 1 ms; rise time 0.5 ms.

The masking pattern displayed in Fig. 1 suggests the following statements:

1) The backward masking pattern is very similar to the simultaneous masking contour.

2) The simultaneous masking contour is almost independent of time; a development cannot be recognized.

3) The forward masking pattern shows steeper slopes than both the simultaneous and the backward masking pattern.

A similar experiment was performed with a masking impulse cut out of a sinusoid. The masker frequency was 8.5 kHz (21.5 Bark), the sound pressure level L_M = 70 dB and T_M = 200 ms. The test-tone impulses had the duration T_T = 2 ms; the rise time of the Gaussian gating signal was 1 ms. The smooth curves depicted in Fig. 2 are likewise based on more than 1000 single results.




Fig. 2: Transient masking pattern of a sinusoidal masker impulse at 21.5 Bark. L_M = 70 dB; T_M = 200 ms; T_T = 2 ms; rise time 1 ms.

The observations described above for a critical-band masker hold for a sinusoidal masker as well. However, two principal differences can be noticed:

1) The transient masking pattern of the sinusoidal masker shows steeper slopes than the pattern of the critical-band masker.

2) The masking produced by a sinusoidal masker is smaller than that produced by a critical-band masker. The whole transient masking pattern of the sinusoid lies 10 dB below the pattern of the critical-band masker, which is hard to explain merely by the different test-impulse durations of 1 ms and 2 ms, respectively.



A detailed representation of the forward masking pattern of both critical-band and sinusoidal masker is given in Fig. 3. The medians with interquartiles of at least 18 threshold values of 9 observers are shown. The delay time between masker termination and the end of the test-tone impulse was 3 ms, 11 ms and 31 ms for the critical-band masker and 5 ms, 10 ms and 50 ms for the sinusoidal masker (circles, squares and triangles, respectively). The crosses represent the threshold in quiet of the test impulses with frequency f_T corresponding to the critical-band rate z.

Fig. 3: Forward masking pattern of a critical-band and a sinusoidal masker impulse at 8.5 kHz, respectively.
a) L_M = 70 dB; T_M = 500 ms; T_T = 1 ms; rise time 0.5 ms; delay time 3 ms, 11 ms, 31 ms.
b) L_M = 70 dB; T_M = 200 ms; T_T = 2 ms; rise time 1 ms; delay time 5 ms, 10 ms, 50 ms.
(Abscissa: test-tone frequency, 6 to 13 kHz, and critical-band rate, 19 to 24 Bark.)




As can be seen in Fig. 3, for long delay times the maximum of the forward masking contour is shifted towards higher frequencies.

It is tempting to compare these maximum shifts with results of Vogten (1972 Fig. 4), which indicate a frequency dependence of maximal masking as a function of masker level, because both shifts amount to a few percent and tend in the same direction.

In any case, the most important result of our experiments is that a masking sinusoid elicits a narrower masking pattern than a critical-band masker.

Recently, data concerning the frequency selectivity of the ear have been provided by means of the "pulsation threshold" (see Houtgast 1972). A tone impulse is presented alternately with a "masker" impulse. At a critical sound pressure level, the interrupted tone sounds as if it were continuous. This level is defined as the pulsation threshold; it can be measured by the same methods as the masked threshold.

By the method of Békésy tracking, we determined the pulsation threshold pattern of a tone and of a critical-band noise at f_M = 1850 Hz, respectively, with a sound pressure level of L_M = 70 dB. The time pattern of the presentation is indicated by the insert in Fig. 4. Both sounds had a duration of 100 ms and a Gaussian rise and fall of 10 ms. Open triangles represent the pulsation threshold pattern of the narrow band noise, open circles refer to the 1850 Hz sinusoid (9 observers, medians with interquartiles of 18 threshold values, respectively). The filled triangles indicate the conventional masking pattern of the continuous critical-band noise (from Fastl 1972 Fig. 1); the filled circles represent results of Houtgast (1972 Fig. 6), transferred to the critical-band rate scale and shifted from 8.5 Bark (1 kHz) to 12.5 Bark (1850 Hz).
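The frequency to critical-band rate pairs quoted in this paper (1 kHz and 8.5 Bark, 1850 Hz and 12.5 Bark, 8.5 kHz and 21.5 Bark) can be reproduced approximately with a closed-form expression. The Python sketch below uses the analytical approximation published later by Zwicker and Terhardt (1980); it is not the conversion used by the author, who would have relied on tabulated critical-band data, so it is offered only as a convenient assumption.

```python
import math


def hz_to_bark(f_hz: float) -> float:
    """Critical-band rate in Bark (Zwicker & Terhardt 1980 approximation)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


for f in (1000.0, 1850.0, 8500.0):
    print(f, round(hz_to_bark(f), 1))
```

The approximation agrees with the values quoted in the text to within a few tenths of a Bark.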




Fig. 4: Comparison of pulsation threshold pattern and conventional masking pattern.
Open triangles: pulsation threshold pattern of a critical-band noise at 12.5 Bark.
Filled triangles: conventional masking pattern of a critical-band noise at 12.5 Bark (from Fastl 1972 Fig. 1).
Open circles: pulsation threshold pattern of a sinusoid at 12.5 Bark.
Filled circles: pulsation threshold pattern of a sinusoid transferred from 8.5 Bark to 12.5 Bark (from Houtgast 1972 Fig. 6).

The data presented in Fig. 4 suggest:

1) For a critical-band noise, the conventional masking method and the pulsation method lead to almost the same pattern (filled and open triangles).

2) The pulsation method, too, yields a narrower pattern for a sinusoid than for a critical-band noise (open circles and triangles).

3) The pulsation threshold seems to depend distinctly on individual observers (see Houtgast 1972 Table I); a frequency effect cannot be ruled out (open and filled circles).

4. SUMMARY 

The transient masking pattern of a sinusoidal as well as a 
critical-band masker did not suggest a development of the fre- 
quency selectivity of the ear with a time constant of about 
10 ms (Scholl 1962, Elliott 1967). A masking sinusoid elicited 
a narrower masking pattern than a critical-band noise. This 
could be proved for conventional masking method as well as for 
pulsation threshold method. For a critical-band masker, pulsa- 
tion method and conventional masking method yielded the same 
pattern. 

In physiological experiments, sinusoidal sounds are preferred. 
Thus, differences between physiological correlates to sinusoids 
and narrow band noise, respectively, can hardly be compared at 
present with psychophysical data. However, the well established 
masking pattern of narrow band maskers (cf. Zwicker 1973) makes 
related physiological data desirable. 

ACKNOWLEDGEMENT: The author is indebted to G. Kauth for executing part of the experiments with sinusoidal masker. The investigations were supported by the Deutsche Forschungsgemeinschaft.

REFERENCES 

Elliott, L.L. (1967) Development of auditory narrow band frequency contours, J. Acoust. Soc. Amer. 42, 143

Fastl, H. (1972) Temporal effects in masking, in: Symposium on Hearing Theory, IPO Eindhoven

Houtgast, T. (1972) Psychophysical evidence for lateral inhibition, J. Acoust. Soc. Amer. 51, 1885

Scholl, H. (1962) Das dynamische Verhalten des Gehörs bei der Unterteilung des Schallspektrums in Frequenzgruppen, Acustica 12, 101

Vogten, L.L.M. (1972) Pure-tone masking of a phase-locked tone burst, in: IPO Annual Progress Report 7

Zwicker, E. (1973) Temporal effects in psychoacoustical excitation, in: Basic Mechanisms in Hearing, Ed. Møller, Academic Press, New York

Zwicker, E. und Feldtkeller, R. (1967) Das Ohr als Nachrichtenempfänger, Hirzel Verlag Stuttgart







MASKING PATTERNS AND LATERAL INHIBITION 
T. HOUTGAST

Institute for Perception TNO, Soesterberg, the Netherlands 



Summary 



The main issue of this contribution is a comparison among several types of psychophysical masking patterns in the light of one specific electrophysiological "fact" about the neural coding of a masker's sound spectrum, namely lateral inhibition. Masking experiments were performed in a number of different ways, in which the temporal presentation of masker and test tone was a main variable. We essentially found two types of results: one type, in those cases where the test tone was superimposed on the masker (direct masking), which consistently shows no effects related to lateral inhibition, and a second type, when masker and test tone were presented nonsimultaneously (e.g., forward masking), which consistently does show effects related to lateral inhibition. Thus, it would appear that only this latter type of masking pattern is closely related to the neural coding of a masker's sound spectrum.

The concept of an "auditory projection" of a stimulus' sound spectrum plays an important role in auditory theories. Although it is difficult to give a proper definition of this "auditory projection", it is generally felt that it should reflect the effects of the first stages of auditory processing and, consequently, agree with electrophysiological 'facts' on the neural coding of a sound at the lower levels of the auditory pathway. We were interested in effects of lateral inhibition or lateral suppression, as revealed by several electrophysiological studies, in the auditory projection of a sound spectrum.

A traditional method in psychoacoustics for investigating the auditory 
projection of a sound is by means of masking experiments with a pure tone as 
test signal. Commonly, it is assumed that a masking pattern mirrors the au- 
ditory projection of a masker's sound spectrum. Accordingly, we performed 
masking experiments with various maskers which seemed interesting from the 
point of view of lateral suppression. It appeared that the results did de- 
pend essentially on the type of masking experiment: in direct masking, when 
the test tone is superimposed on the masker, no effects of lateral suppres- 
sion were found, whereas in nonsimultaneous masking (forward masking or 
pulsation threshold, see below) clear effects of lateral suppression were 
obtained (Houtgast, 1974a). Some examples will be given for various types 
of maskers. 







1. MEASUREMENTS 



1.1. METHODS 

Three types of masker-and-test-tone experiments will be considered, which differ markedly in the temporal presentation of masker and test tone.

a. Direct masking. The test tone was superimposed on the masker and the threshold value referred to the detectability of the test tone.

b. Forward masking. A brief test tone was presented shortly after the termination of the masker and, again, the threshold value referred to the detectability of the test tone.

c. Pulsation-threshold method. In this case, 125-msec masker bursts and 125-msec test-tone bursts were presented alternately, without silent intervals. The threshold value referred to the way in which the series of test-tone bursts were perceived: either as a pulsating tone in accordance with the alternation rhythm (for high test-tone levels) or as a continuous tone (for low test-tone levels). The subject's task was facilitated by leaving out each fourth test-tone burst, such that 'pulsating' corresponded to the perception of series of three short test-tone bursts, whereas 'continuity' corresponded to the perception of relatively long test-tone bursts with a one-sec cycle.
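The timing of the pulsation-threshold stimulus can be laid out as a simple schedule. This Python sketch only restates the numbers given above (125-msec bursts, every fourth test-tone burst omitted); the function name and the label "gap" for the slot without a test tone are illustrative assumptions, and the text does not say what, if anything, filled that slot.

```python
def pulsation_cycle(burst_ms=125, omit_every=4):
    """Return one cycle of the alternating stimulus and its length in ms.

    Masker and test-tone bursts alternate without silent intervals; the
    test-tone burst in every `omit_every`-th slot is left out ("gap").
    """
    events = []
    t = 0
    for k in range(1, omit_every + 1):
        events.append((t, "masker"))
        t += burst_ms
        events.append((t, "gap" if k == omit_every else "test tone"))
        t += burst_ms
    return events, t


events, cycle_ms = pulsation_cycle()
for start, label in events:
    print(start, label)
print("cycle length (ms):", cycle_ms)
```

Four masker and test-tone slot pairs of 250 msec each give the one-second cycle mentioned in the text, with three test-tone bursts per cycle.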

Thresholds were measured according to three different procedures.

a. Adjustment. The subject controlled the dependent variable and was instructed to adjust it to threshold value.

b. Békésy up-down. In this procedure, as in the previous one, the measuring condition was repeated continuously. The dependent variable was changed automatically and the subject could only determine the direction of that change, by pushing or releasing a button. He was instructed in accordance with the usual up-down tracking procedure, and the average value of the dependent variable during a fixed period was considered the threshold value.

c. 2-AFC up-down. The stimuli were presented in a sequence of trials, each consisting of two observation periods. The masker was identical in both periods, and the test tone was presented in only one of the two periods (randomly). After each trial, the subject had to indicate which period contained the test tone. After each incorrect decision, the dependent variable was changed in the direction for which detectability improved, and after two successive correct decisions it was changed in the opposite direction. The average value of the dependent variable during a fixed period was considered the threshold value.
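The 2-AFC up-down rule described above can be sketched as follows. This is a minimal Python illustration, assuming that the dependent variable is the test-tone level, that the step size is fixed, and that the count of correct decisions restarts after every level change; the step size, starting level and function names are hypothetical and not taken from the paper.

```python
def two_afc_up_down(responses, start_level=60.0, step=2.0):
    """Track the test-tone level over a sequence of 2-AFC trials.

    responses: iterable of booleans, True for a correct decision.
    Returns the list of levels at which each trial was presented.
    """
    level = start_level
    correct_in_a_row = 0
    levels = []
    for correct in responses:
        levels.append(level)
        if not correct:
            level += step            # make the test tone easier to detect
            correct_in_a_row = 0
        else:
            correct_in_a_row += 1
            if correct_in_a_row == 2:
                level -= step        # make the test tone harder to detect
                correct_in_a_row = 0
    return levels


def threshold_estimate(levels):
    """Average of the dependent variable over the tracked period."""
    return sum(levels) / len(levels)


track = two_afc_up_down([True, True, True, False, True, True])
print(track, threshold_estimate(track))
```

A rule of this kind (one step up after an error, one step down after two successive correct decisions) is known to converge on the level at which about 70.7 % of decisions are correct.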

The experiments were performed monaurally by means of a Beyer DT-48 headphone. All levels refer to the electrical signal fed to the headphone. Pure-tone levels are expressed in dB relative to the level of a 1000-Hz 200-msec pure tone at hearing threshold. Noise is characterized by its spectral level N_0, the intensity in 1-Hz intervals in dB relative to the intensity of a 1000-Hz 200-msec pure tone at hearing threshold. In the various schematic diagrams of the spectral and temporal composition of the stimuli, the test tone is always indicated by an interrupted line. Furthermore, the single-pointed arrow always indicates the dependent variable, and the double-pointed arrow(s) the independent variable(s), set by the experimenter.




Fig. 1. Masking experiments on tone-on-tone suppression. The shaded areas in the two lower data graphs indicate that the addition of the second tone reduces the masking effectiveness at 1000 Hz. (Average of two subjects.)

1.2. TWO-TONE MASKER

This experiment is illustrated in Fig. 1. The left column of data graphs refers to a single 1000-Hz masker with a variable level L_1 (abscissa) and a 1000-Hz test tone with level L_T (dependent variable), which were presented in phase. Subsequently, L_1 was fixed at 40 dB, and the right column of data graphs indicates the effect of the addition of a second masking tone with a level of 60 dB and a variable frequency f_2 (abscissa). It will be seen that, only in forward masking and with the pulsation-threshold method, the addition of the second tone may result in a considerable reduction of the masking effectiveness at 1000 Hz (hatched areas in the data graphs). This is typical of the operation of a lateral suppression mechanism and fits in with effects of two-tone inhibition in primary auditory neurons (Sachs and Kiang, 1968; Arthur et al., 1971). Additional experiments further underlined the similarity between the results of pulsation-threshold measurements on two-tone maskers and neural data on two-tone inhibition (Houtgast, 1973).

This experiment is illustrated in Fig. 2. It is very similar to the previous one except that, instead of adding a second tone to the 1000-Hz masker, white noise was added at various spectral levels N_0. Again, the results obtained in forward masking and with the pulsation-threshold method indicate that the addition of the noise may reduce the masking effectiveness at 1000 Hz (hatched areas in the data graphs). This strongly suggests an effect of noise-on-tone suppression and may be related to an observation of Kiang (1965), that the addition of noise to a tone at a fibre's best frequency may cause a reduction of the firing rate. More extensive experiments on noise-on-tone suppression are presented elsewhere (Houtgast, 1974b).

1.4. NOISE MASKER OF VARIABLE BANDWIDTH

This experiment is illustrated in Fig. 3. The noise band was obtained by carrier-suppressed modulation of a 1000-Hz carrier with low-pass filtered noise (48 dB/oct). The bandwidth B and spectral level N_0 were the independent variables, and the level of the 1000-Hz test tone was the dependent variable. The aspect of primary interest here is the effect of bandwidth B on the masking effectiveness in the centre of the noise band. The direct-masking data are not in conflict with the traditional concept of a 'critical band' with a width of about 160 Hz (at 1000 Hz). However, the pulsation-threshold data strongly suggest the operation of a lateral suppression mechanism and may be related to similar effects observed in the response of neurons in the cochlear nuclei to variations in noise bandwidth (Greenwood and Goldberg, 1970).
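Since the noise is specified by its spectral level (intensity per 1-Hz interval) and its bandwidth, the overall level of the band follows from the standard relation for a flat spectrum, level = N_0 + 10 log10(B / 1 Hz). The Python sketch below illustrates this relation; the function name and the example values are chosen for illustration and do not come from the paper.

```python
import math


def band_level_db(spectral_level_db: float, bandwidth_hz: float) -> float:
    """Overall level of a flat noise band: N_0 + 10*log10(B / 1 Hz)."""
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth must be positive")
    return spectral_level_db + 10.0 * math.log10(bandwidth_hz)


# Hypothetical example: spectral level 30 dB, bandwidths around a critical band.
for b in (40.0, 160.0, 640.0):
    print(b, round(band_level_db(30.0, b), 1))
```

Widening the band at constant N_0 therefore raises the total noise power, which is why a decrease in masking effectiveness with increasing B is the noteworthy finding.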

1.5. RIPPLED-NOISE MASKER 

This is an example taken from an extensive series of experiments on rippled-noise maskers (Houtgast, 1974a). A 1000-Hz test tone was used with a fixed level L_T, and the masker was noise of which the intensity as a function of (linear) frequency was shaped sinusoidally (see Fig. 4). Independent variables were the peak-to-valley ratio D (parameter) and the relative ripple density 1000/Δf (abscissa). The noise level N_0 was the dependent variable. Each data point gives the difference between the two conditions 'tone in valley' and 'tone at peak', and it will be seen that, generally, this difference diminishes when ripple density increases. Besides the obvious quantitative difference between the degree of ripple resolution revealed by the two different masking paradigms, the results also show an interesting qualitative difference in the case of rippled noise with a reduced peak-to-valley ratio (D = 5.7 dB). Only the curve in the right panel (pulsation threshold) shows a maximum at around d = 2 or 3, which goes beyond the 'physical' modulation depth of 5.7 dB. This maximum is very typical of the operation of a lateral suppression mechanism and may be related to measurements of ten Kate et al. (1974) on single-unit responses in cochlear nuclei of the cat to these rippled-noise stimuli: only for smaller values of D (11 dB, 5.7 dB or 3.1 dB), the peak-valley differences show a maximum for a relative ripple density (unit's CF/Δf) of 3 to 4.
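The 'physical' peak-to-valley ratio of a rippled-noise spectrum can be made concrete with a small sketch. It assumes an intensity spectrum of the form I(f) = 1 + a cos(2π(f - f_peak)/Δf) on a linear frequency axis, with a spectral peak placed at the 1000-Hz test tone; this normalization and the function names are illustrative assumptions, not the author's stimulus generation.

```python
import math


def ripple_amplitude(depth_db: float) -> float:
    """Relative ripple amplitude a for a peak-to-valley ratio D in dB,
    assuming an intensity spectrum I(f) = 1 + a*cos(...)."""
    ratio = 10.0 ** (depth_db / 10.0)
    return (ratio - 1.0) / (ratio + 1.0)


def rippled_intensity(f_hz, delta_f_hz, depth_db, peak_at_hz=1000.0):
    """Relative intensity of the rippled noise with a peak at `peak_at_hz`."""
    a = ripple_amplitude(depth_db)
    return 1.0 + a * math.cos(2.0 * math.pi * (f_hz - peak_at_hz) / delta_f_hz)


def peak_to_valley_db(delta_f_hz, depth_db):
    """Level difference between a spectral peak and the adjacent valley."""
    peak = rippled_intensity(1000.0, delta_f_hz, depth_db)
    valley = rippled_intensity(1000.0 + delta_f_hz / 2.0, delta_f_hz, depth_db)
    return 10.0 * math.log10(peak / valley)


delta_f = 250.0                      # ripple spacing in Hz (hypothetical)
print("d =", 1000.0 / delta_f)       # relative ripple density
print(round(peak_to_valley_db(delta_f, 5.7), 2))
```

In this idealized spectrum the peak-to-valley difference equals D for every ripple density, so a measured difference exceeding D, as reported for the pulsation threshold, cannot come from the stimulus spectrum alone.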




262 



Houtgast: MASKING PATTERNS AND LATERAL INHIBITION 






FORWARD MASKING 





Fig. 2. Masking experiments on noise-on-tone suppression. The shaded 
areas in the two lower data graphs indicate that the addition 
of the noise reduces the masking effectiveness at 1000 Hz. 
(Average of two subjects.) 











263 



Houtgast: MASKING PATTERNS AND LATERAL INHIBITION 





[Fig. 3 legend: pure-tone masker (90° phase)]

Fig. 3. Masking experiments with a 1000-Hz test tone located in the centre
        of a band of noise with variable spectral level N_0 (parameter) and
        variable bandwidth B (abscissa). The lower graph indicates that the
        masking effectiveness decreases when B is widened beyond a value of
        about 150 Hz. (Average of two subjects.)











Fig. 4. Masking experiments with a fixed 1000-Hz test tone and a
        rippled-noise masker. Each data point presents the threshold
        difference between 'tone in valley' and 'tone at peak'. (Data of
        one subject.) Left panel: DIRECT MASKING (adjustment); right panel:
        PULSATION THRESHOLD (adjustment). Abscissa: relative ripple density
        d = 1000/Δf.

2. DISCUSSION 

The examples presented here illustrate that neural data on lateral
suppression are readily revealed by masking experiments, provided the masking
effectiveness in the suppressed frequency region is probed with a test tone
presented nonsimultaneously. This suggests, if we accept a qualitative
similarity between the hearing mechanisms in man and cat (from which most
neural data are obtained), that nonsimultaneous masking techniques give a
correct picture of the auditory projection of a masker's sound spectrum,
including effects of lateral suppression, whereas traditional direct-masking
methods do not.

The absence of effects of lateral suppression in direct masking can be
understood by realizing that, for a superimposed test tone, both the masker
and the test tone are subjected to the same effect of lateral suppression,
such that, apparently, the S/N ratio in the suppressed frequency region is
not affected. The essential difference between a superimposed and a
nonsimultaneous test tone suggests that the effect of lateral suppression
stops almost immediately at the end of the masker, such that the
effectiveness of the nonsimultaneous test tone is not reduced in the same way
as the masker. Thus, when the degree of lateral suppression in the test-tone
frequency region changes, affecting only the masker and not the test tone,
the test-tone level has to be changed "externally" in order to remain at
threshold. This is precisely what we measured in nonsimultaneous masking. It
should be noted that this reasoning does not imply any specific mechanism for
forward masking or pulsation threshold. The only requirement is that the
threshold condition is associated with neural processes at a level in the
auditory pathway beyond the stage where lateral suppression operates. In case
of forward masking, one may think of a process of decay of, or recovery from,
neural effects caused by the preceding masker, and in case of pulsation
threshold, when masker and test tone are alternated continuously, of a
process of continuity of the response in the neural region corresponding to
the test-tone frequency (see also Houtgast, 1974a).

Briefly, the experimental results indicate that lateral suppression con- 
tributes substantially to the preservation of spectral contrasts in the au- 
ditory projection of a sound spectrum and, furthermore, that nonsimultaneous 
masking, in contrast to traditional direct masking, gives a correct picture
of this auditory projection. 

REFERENCES 

Arthur, R.M., Pfeiffer, R.R. and Suga, N. (1971). "Properties of 'Two-Tone
Inhibition' in Primary Auditory Neurons," J. Physiol. 212, 593-609.

Greenwood, D.D. and Goldberg, J.M. (1970). "Response of Neurons in the
Cochlear Nuclei to Variations in Noise Bandwidth and to Tone-Noise
Combinations," J. Acoust. Soc. Amer. 47, 1022-1040.

Houtgast, T. (1973). "Psychophysical Experiments on 'Tuning Curves' and
'Two-Tone Inhibition'," Acustica 29, 168-179.

Houtgast, T. (1974a). Lateral Suppression in Hearing: A Psychophysical
Study on the Ear's Capability to Preserve and Enhance Spectral Contrasts
(Institute for Perception TNO, Soesterberg, the Netherlands).

Houtgast, T. (1974b). "Lateral Suppression and Loudness Reduction of a Tone
in Noise," to be published in Acustica.

Kate, J.H. ten, Bilsen, F.A., Raatgever, J. and Buunen, T.J.F. (1974).
"Single Unit Responses in Acoustic Nuclei of Cat to Noise and Its
Attenuated Repetition," submitted to 8th I.C.A., London.

Kiang, N.Y.S. (1965). Discharge Patterns of Single Fibers in the Cat's
Auditory Nerve (M.I.T. Press, Cambridge, Mass.).

Sachs, M.B. and Kiang, N.Y.S. (1968). "Two-Tone Inhibition in Auditory-
Nerve Fibers," J. Acoust. Soc. Amer. 43, 1120-1128.







COMMENTS ON: Masking patterns and lateral inhibition (T. HOUTGAST) 

F.A. BILSEN 

Applied Physics Department, Delft University of Technology, Netherlands. 



One of the masker stimuli investigated by Houtgast, viz. the "rippled
noise" (or "repetition noise" (Boerger, this symposium) or "noise with its
repetition after a delay τ" (Bilsen et al., 1970)), has been the main
stimulus in electrophysiological experiments by ten Kate et al. (1973, 1974)
on the neural coding of Repetition Pitch stimuli. In their experiments the
average spike rate SR(τ) of cochlear nucleus units in cat is recorded
with a PDP-8 computer as a function of the delay time τ (τ equals "relative
ripple density" or "harmonic number"). In Fig. 1 such a recording SR(τ)
from a "chopper" neuron is represented. The spike rate has relative maxima
at τ = n/CF (CF is the characteristic frequency of the unit; n is a positive
integer).




Fig. 1. Response of a cochlear nucleus unit GC 6 to noise and its
attenuated repetition: SR(τ) (left) for g = -5 dB. Upper right: the
weighting function W(f) calculated as the average FFT of several
SR(τ) recordings for different g-values and SPL's, corrected
with a linear SR-INT (dB) relation as measured (8 spikes/dB sec.).
Lower right: iso-intensity curve for tone in noise (both at 40 dB
SPL): SR(f).




The appearance of an absolute maximum at about τ = 3/CF is a typical
result observed for many other units. This phenomenon might be related to
Houtgast's psychophysical analogue (see his fig. 4). Further, one might
speculate on its implication for spectral dominance in pitch perception
(Bilsen et al., 1970; ten Kate et al., 1973).

Fig. 1 also represents the unit's response to a tone in white noise.
The spike rate SR(f) is recorded as a function of the frequency of the
tone, for isointensity of both tone and noise. Clearly, suppression can be
observed on the high frequency side. This finding might be compared with
Houtgast's fig. 2.

Applying linear systems theory (allowed to a certain extent for reasons
not elaborated here) a weighting function W(f) can be calculated from SR(τ);
it characterizes "peripheral filtering" up to the level of the cochlear
nucleus. With φ(ω,τ) = 1 + g cos ωτ being the power spectrum of the acoustical
stimulus, the output power P(τ) of the "filter" equals

\[ P(\tau) = \int_0^\infty (1 + g\cos\omega\tau)\, W(\omega)\, d\omega = c + g \int_0^\infty W(\omega)\cos\omega\tau \, d\omega . \]

With W(ω) = W(-ω) the integral on the right is recognized as a Fourier
integral. Using the unit's spike rate as a function of noise level in dB, P(τ)
can be calculated from SR(τ). Finally, W(f) is obtained from P(τ) by inverse
Fourier transformation. Thus, the function W(f) in Fig. 1 is the result of
fast Fourier transformation (FFT) on the PDP-8 computer. It may be compared
to SR(f).
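
The relation between the weighting function and the output power can be made concrete with a small numerical sketch. It assumes a hypothetical Gaussian weighting function centred on a CF of 1000 Hz (the shape, width and all names are illustrative assumptions, not data from the unit in Fig. 1) and evaluates P(τ) by direct summation over frequency in Hz, i.e. with cos 2πfτ in place of cos ωτ.

```python
import math

def weighting(f_hz, cf_hz=1000.0, bw_hz=100.0):
    """Hypothetical Gaussian weighting function W(f); the shape is an assumption."""
    return math.exp(-0.5 * ((f_hz - cf_hz) / bw_hz) ** 2)

def output_power(tau_s, g, f_max_hz=4000.0, step_hz=1.0):
    """P(tau) = sum over f of (1 + g*cos(2*pi*f*tau)) * W(f) * df (Riemann sum)."""
    n_steps = int(f_max_hz / step_hz)
    total = 0.0
    for k in range(n_steps + 1):
        f = k * step_hz
        spectrum = 1.0 + g * math.cos(2.0 * math.pi * f * tau_s)  # rippled power spectrum
        total += spectrum * weighting(f) * step_hz
    return total

cf = 1000.0
p_at_peak = output_power(1.0 / cf, g=0.5)    # delay equal to one period of CF
p_at_dip = output_power(1.5 / cf, g=0.5)     # delay equal to one and a half periods
```

With these assumptions the output power is larger at τ = 1/CF than at τ = 1.5/CF, in line with the relative maxima of SR(τ) at τ = n/CF described above. The cosine-modulated part of P(τ) is the cosine transform of W, which is why an inverse transformation of P(τ) returns W(f).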

REFERENCES 

Bilsen, F.A., Ritsma, R.J. (1969/70). Repetition Pitch and its implication
for hearing theory. Acustica 63-73.

Kate, J.H. ten, Bilsen, F.A. and Raatgever, J. (1973). Spectral properties
of single unit responses in the cochlear nucleus to noise with its
repetition. Delft Progr. Rep. 17-24.

Kate, J.H. ten, Bilsen, F.A., Raatgever, J. and Buunen, T.J.F. (1974). Single
unit responses in acoustic nuclei of cat to noise and its attenuated
repetition. Accepted for 8th I.C.A., London.







ADDITIONAL REMARKS 

EVANS: Clarification of confusion can be obtained if we distinguish
"lateral suppression" at the cochlear nerve from that at higher
levels of the system, e.g. cochlear nucleus. Physiologically, lateral
suppression at cochlear nerve level is a much weaker phenomenon than
lateral inhibition in the dorsal cochlear nucleus. Neural correlations
with the forward masking and pulsation threshold effects are found at
the cochlear nucleus and not at the cochlear nerve level. Thus, using
comb-filtered noise, no evidence of lateral suppression effects is
found at the cochlear nerve (Evans and Wilson, 1973), whereas they are
found in the dorsal cochlear nucleus (our unpublished observations;
ten Kate et al.).

WILSON: It would appear that the cochlear nerve level where the phenomenon
of two-tone suppression is found (Sachs and Kiang, 1968) is not the most
appropriate for comparison with the inhibitory effects that you observe
psychophysically. We have performed four different types of experiment
which indicate that the kind of signals that you have been considering do
not produce such effects in cat cochlear nerve fibres. (1) Comb-filtered
(rippled) noise stimuli produce exactly the (dB) contrast/ripple density
function predicted on linear filter theory with no indication of lateral
inhibitory effects (Wilson and Evans, 1971; Evans and Wilson, 1973). (2)
The relative thresholds for white noise and for a tone at the CF are again
exactly as predicted (see Evans and Wilson, 1973). (3) The response to
a sharply defined 4 kHz band of noise is as predicted at all frequency
positions with no "edge effects" (Evans, Rosenberg and Wilson, unpublished
observations). (4) The response to a special rippled noise spectrum with
lower peak-to-valley ratios set to maximise the possibility of two-tone
suppression failed to reveal such an effect (Wilson, Evans and Rosenberg,
1974).

REFERENCES 

Evans, E.F., Wilson, J.P. (1973). "The frequency selectivity of the
cochlea," in: Basic Mechanisms of Hearing (ed.) A.R. Møller, A.P., N.Y.,
519-551.

Wilson, J.P., Evans, E.F. (1971). "Grating acuity of the ear: psycho-
physical and neurophysiological measures of frequency resolving
power." Proc. 7th ICA, Vol. 3, 397-400.

Wilson, J.P., Evans, E.F., Rosenberg, J. (1974). "Linearity of the cochlear
nerve fibre filter response: a test for the influence of two-tone
suppression." Proc. 8th ICA.







THE SLOPES OF MASKING PATTERNS (General Comments) 

T. HOUTGAST 

Institute for Perception TNO, Soesterberg, the Netherlands



The paper by Fastl ("Transient Masking Patterns of Narrow Band Maskers")
is an interesting contribution to this symposium, showing the effect
of masker bandwidth on the transient masking pattern. Besides this main
issue, the paper also provides information on a topic which may deserve
some further attention: a comparison between the shape of a masking
pattern of a narrow-band noise measured in direct masking or with the
pulsation-threshold method. The experimental data given in the paper suggest
that "For a critical-band noise, the conventional masking method and the
pulsation method lead to almost the same pattern ....". The intriguing
aspect of this result is that it seems to be in conflict with the general
framework of direct-masking data and pulsation-threshold data as presented
in the contribution to this symposium by Houtgast ("Masking Patterns
and Lateral Inhibition"). The reasoning is simple. Basically,
Houtgast's results suggest that (1) the internal representation of a
stimulus' sound spectrum is subjected to lateral suppression, operating
mainly in the direction from higher towards lower frequencies, and (2)
a conventional masking pattern does not reflect this effect of lateral
suppression, whereas a pulsation-threshold pattern does so. Within this
framework, one would expect the low-frequency slope of a masking pattern
measured with the pulsation-threshold method to be steeper than that
measured in direct masking. Apparently, this framework and the result of
Fastl are in conflict.

I think it is a natural weakness to stick to one's own framework as 
long as possible (and perhaps even longer). Therefore, before questioning 
the validity of our framework, we performed some experiments directed
explicitly to this question of a comparison between the low-frequency
slopes of masking patterns obtained by the two methods (direct masking 
versus pulsation-threshold method). 







Houtgast: THE SLOPES OF MASKING PATTERNS 



METHOD AND RESULTS 

The masker was a narrow-band noise with a very steep slope at the low- 
frequency side (the spectral level in dB/Hz is presented in each data 
graph). Three different masker levels were considered, with an over-all 
RMS level of 42, 57 and 72 dB, respectively. (All levels refer to the 
electrical signals fed to a Beyer DT-48 telephone; 0 dB refers to the 
intensity of a 1000-Hz 200-msec tone at hearing threshold.) The
experiments were performed monaurally with five subjects. Fig. 1 illustrates
the stimulus presentation for the two methods. 

[Fig. 1 panel labels: DIRECT MASKING (adjustment); PULSATION THRESHOLD
(adjustment)]

Fig. 1. Temporal pattern of the presentation of the masker (hatched) and
the test tone f_T in case of direct masking and in case of the
pulsation-threshold method.



Direct masking. The masking noise was presented continuously. Test-tone
bursts were presented in series of four on, four off, four on, etc. The
test-tone frequency f_T was the independent variable and the test-tone
level L_T the dependent variable. The subject adjusted L_T to that value at
which the test-tone bursts could just be perceived. For each condition,
each subject made two such adjustments in separate sessions.

Pulsation-threshold method. The noise and the test tone were presented in
continuous alternation. Again, the test-tone bursts were presented in
series of four on, four off, etc. The subject adjusted L_T to that value
at which the pulsating character of each series of four successive
test-tone bursts could just be perceived. Again, in each condition two such
adjustments were performed in separate sessions.

Since our primary interest was a comparison between the results of
the two methods, a direct-masked threshold and a pulsation threshold were
always obtained in immediate succession for each value of f_T (in the
additional session the order was reversed).




Fig. 2. Lower panels: direct-masking pattern (closed symbols) and pulsation-threshold pattern (open
symbols) of the band of noise of which the spectral level in dB/Hz is indicated. (Average data of five
subjects.) Upper panels: individual data on the differences between the pulsation threshold and the
direct-masked threshold.










The results obtained with the three different masker levels are
presented in Figs. 2a, b, c, respectively. The lower panels indicate the
(averaged) masking patterns for direct masking (closed symbols) and pul- 
sation threshold (open symbols). The values of the slopes in dB/Bark are 
derived from the upper part of the skirt. Individual data on the dif- 
ferences between the pulsation threshold and the direct-masked threshold 
are presented in the upper panels. 

DISCUSSION 

The data indicate a systematic difference between the low-frequency 
slope of a masking pattern measured in direct masking or with the pulsa- 
tion-threshold method: the latter appears to be considerably steeper. This 
result is both amusing and frustrating. Amusing, because in the light of 
the introduction one might easily be inclined to associate this result 
with some kind of "wishful experimenting", though this very suggestion
itself would of course undermine a whole field of psychophysics. The 
frustrating aspect is more serious. Apparently, very similar and rather 
simple experiments may still yield such different results when performed 
in different laboratories. There is no obvious reason for this discrepan- 
cy. Hence, at this stage, we can go no further than noting that the ques- 
tion whether the low-frequency slopes of a direct-masking pattern and a 
pulsation-threshold pattern are essentially the same or different remains 
open. 

REFERENCES 

Fastl, H. (1974). "Transient Masking Pattern of Narrow Band Maskers",
this Symposium. 

Houtgast, T. (1974). "Masking Patterns and Lateral Inhibition", this 
Symposium. 







COMMENTS ON: "The slopes of masking patterns" (HOUTGAST) 
H. FASTL

Institut für Elektroakustik der TU München



The main result of Houtgast's experiment is that "the data indicate a
systematic difference between the low-frequency slope of a masking pattern
measured in direct masking or with the pulsation-threshold method: the
latter appears to be considerably steeper."

We did a quite similar experiment, the results of which are plotted in 
Fig. 1. The masker was a critical band noise at 2 kHz as indicated by the 
hatched area with an over all SPL of 60 dB. Pulsation thresholds Lj 
(triangles) as well as direct masked thresholds Lj (circles) are given as 
function of the test tone frequency f_T corresponding to the critical band
rate z. Each symbol represents the median of 12 threshold adjustments
performed by 6 observers. The time pattern applied was essentially the
same as described in Houtgast's comment; however, the test-tone bursts were
not presented in series of four on, four off, and, with respect to the
pulsation method, masker and signal had no temporal overlap. To minimize the
danger of "wishful experimenting," direct masked threshold and pulsation
threshold at a given f_T were not obtained in immediate succession, but the
whole direct masking contour and pulsation contour were measured in
different runs of our experiment.

The results depicted in Fig. 1 seem to confirm our statement that the 
conventional (direct) masking method and the pulsation method lead to 
almost the same pattern. The small differences of a few dB between
pulsation data and direct masking data show up in the same direction as
indicated by the results of Houtgast. On the other hand, however, the 
differences do not exceed significantly the accuracy of measurement. 




Fig. 1: Comparison between direct masked thresholds (circles) and
pulsation thresholds (triangles)







A CRUDE QUANTITATIVE THEORY OF BACKWARD MASKING* 

H. DUIFHUIS 

Institute for Perception Research, Eindhoven, The Netherlands 

1. Introduction

Backward masking is the phenomenon that an intense masker 
masks a weak and brief probe that precedes the masker. Usually 
it has been attributed to the notion that in the auditory path- 
way the neural response to a louder signal propagates faster 
than the response to a weak signal. Therefore, the response to 
the masker would overtake the probe response, thereby masking it.
As argued before on a largely qualitative basis (Duifhuis, 1972a,
b), the acute peripheral frequency analysis can already cause
temporal overlap of probe and masker, due to transient effects
of peripheral filtering (transient masking, Duifhuis, 1973). A
further quantitative analysis of the peripheral contribution to
backward masking appeared desirable. Recently this analysis
has been worked out, using a descriptive model of the peripheral
auditory system (Duifhuis, 1973). The present state of the
modelling art implies that the theory will be a crude, but 
hopefully a significant first approximation. The main lines of 
the theory are presented in this paper, together with some ten- 
tative supporting neurophysiological data. 

2. Outline of the theory 

Assume that the peripheral ear, from auricle up to primary 
auditory nerve, can be represented by a bank of linear band- 
pass filters followed by probabilistic transducers. The outputs 
of the filters provide the stimulating waveforms for the recep- 
tor cells and we assume that they contain all relevant frequen- 
cy selectivity that is observed in primary units. The frequency 
selectivity can be described reasonably well with two constant 
filter slopes of S1 and S2 dB/oct for low- and high-frequency
slope, respectively (e.g., Siebert, 1968; Goldstein et al., 1971).
We are interested in the shapes of the impulse responses of 
these filters (particularly the envelopes thereof) . Assuming a 







Duifhuis: A QUANTITATIVE BACKWARD MASKING THEORY 

minimum phase filter, the envelope shape follows from the slopes
and depends almost exclusively on the sum of the slopes S = S1 + S2.
From tuning curve data from cat (Goldstein et al., 1971) and guinea pig
(Evans, 1972), the following approximative relation can be found between
S and characteristic frequency: S = 120/f_CF dB/oct, with f_CF in kHz and
S2 ≈ 2 × S1. An approximative description of the probabilistic transducers
is given by a rectifier followed by a nonhomogeneous Poisson process. Let
us further assume that the neural events (i.e. spikes represented
by delta impulses) at the outputs of the Poisson processes
are measured with a leaky integrator having a time constant τ,
and measuring across a number of fibers, N. Let the output of
the integrator be L (L = L(t)). Then, because of the central limit
theorem, L will be distributed approx. normally (cf. Eq. 5) and
we can calculate expected value, E{L(t)}, and variance of L,
σ²{L(t)}, in response to any stimulus. The model is depicted
schematically in Fig. 1.




Fig. 1. Band-pass filter and probabilistic transducer in channel i, followed
by a leaky integrator which counts activity across N fibers. a(t): acoustic
stimulus; h_i(t): impulse response of the i-th filter from the filterbank;
ii(t): stimulating waveform to the i-th probabilistic transducer; ρ_i:
spontaneous rate; r_i(t): rate-function for the Poisson process; z_i(t): activity
in the i-th channel (nerve fiber); L(t): output from the leaky integrator.

A convenient index for the performance in a 2AFC experiment
(see Siebert, 1965) with alternatives probe + masker and masker
alone (related to d': Q = d'²), is:

\[ Q(t) = \frac{\bigl[ E\{L(t \mid \mathrm{PROBE+MASKER})\} - E\{L(t \mid \mathrm{MASKER\ ALONE})\} \bigr]^2}{\tfrac{1}{2}\bigl[ \sigma^2\{L(t \mid \mathrm{PROBE+MASKER})\} + \sigma^2\{L(t \mid \mathrm{MASKER\ ALONE})\} \bigr]} \qquad (1) \]



For the 75% correct response threshold in a 2-interval 2AFC
experiment, one has Q ≈ 1 (Siebert, 1965; Green and Swets, 1966)
for stationary Q. Here Q is a function of t, with a global
maximum at about the value of t that maximizes the numerator of
Eq. 1. The theory predicts the lowest possible probe threshold
A_Pth if it is assumed that the probe is detected at the maximum
of Q. In other words, best performance is predicted when
putting

\[ Q(t_m) = 1, \quad \text{with} \quad \left. \frac{dQ}{dt} \right|_{t = t_m} = 0, \qquad (2) \]

where we assume that the detection criterion Q = 1 applies also
to nonstationary stimuli. On the basis of Eqs. 1 and 2 one
readily obtains A_Pth for each given stimulus configuration (see
Sec. 4). The threshold in quiet is obtained by putting masker
amplitude A_M = 0. It is determined by spontaneous activity and
by the noisy representation of the probe. An increment in A_Pth
follows if A_M > 0 and probe and masker responses of the
integrator overlap. The first order effect on Eq. 1 is an increase of
the denominator with constant numerator. Then Eq. 2 requires an
increase of the numerator, i.e. an increase of A_Pth. The threshold
increment of the probe, which is caused by the presence of
the masker, can be compared with psychophysically observed
backward masking in a 2-interval 2AFC experiment. Figure 2
shows a comparison of prediction and data for a tone-burst
probe and click masker. The masker was presented at 50 dB SL
(details of the experiment are given in Duifhuis, 1973). The
agreement is reasonable insofar as orders of magnitude are
concerned.
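
The correspondence between the Q = 1 criterion and about 75% correct in 2AFC can be checked with a short sketch. It assumes the standard equal-variance Gaussian relation for two-alternative forced choice, P(correct) = Φ(d'/√2), together with Q = d'² as stated above; the function name is introduced here for illustration.

```python
import math

def percent_correct_2afc(q):
    """Proportion correct in 2AFC for index Q = d'**2 (equal-variance Gaussian assumed)."""
    d_prime = math.sqrt(q)
    # P(correct) = Phi(d'/sqrt(2)) and Phi(x) = 0.5*(1 + erf(x/sqrt(2))),
    # so the argument of erf reduces to d'/2.
    return 0.5 * (1.0 + math.erf(d_prime / 2.0))

p_at_criterion = percent_correct_2afc(1.0)
```

Under these assumptions Q = 1 corresponds to roughly 76% correct, i.e. close to the 75% point, and Q = 0 gives chance performance (50%).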



3. Some theoretical considerations

The presented theory is not merely a description of psycho- 
acoustical data. Using a description of the peripheral physiol- 
ogy and applying detection theory to the neural representation 
of the stimulus, psychophysical data are explained in terms of 
physiological observations. 

Combined with the Q=1 detection criterion, the leaky integrator
processes the neural rate information contained in the
spikes almost optimally (in the sense of making the smallest
number of errors in detection of the probe). The processing is
optimum insofar as time window (represented by τ) and frequency
window (the number of fibers N) are reasonable approximations
of time and frequency windows matched to the probe response
(cf. Van Trees, 1968; Siebert, 1968, 1972). Optimum processing sets
ultimate limits to performance: actual performance can only be
worse. If a description in terms of optimum processing leads to
a reasonable prediction of psychophysical data, then we can
conclude that the significant limitation in the auditory
information flow occurs peripherally to the optimum processing. On
the basis of Fig. 2 I believe this to be the case with backward
masking. But to the extent that (1) agreement between theory
and data is not complete, and that (2) τ and N do not exactly
represent matched time and frequency windows, some room may be
left for an additional more central contribution to backward
masking. This, of course, is equivalent to non-optimum
processing of information from auditory nerve level.

Fig. 2. (a) Nonsimultaneous masking data. Click masker and tone-burst probe
are indicated in the top left insert. The ordinate shows the threshold shift
of the probe for different intervals between masker and probe onset, at the
abscissa. The parameter indicated in the figure is the probe frequency f_P.
A 2-interval 2AFC paradigm was used. Masker level: 50 dB SL. (b) Theoretical
masking curves for the same stimulus with f_P as parameter. S follows from
S = 120/f_P dB/oct. Other parameter values: τ = 2 ms; number of effective
fibers N = 1; average spontaneous activity ρ = 25 spikes/s.

For mathematical convenience (Sec. 4) I used a linear rectifier
and omitted refractoriness, adaptation and saturation in
the probabilistic transducers (Fig. 1). At low levels these
effects are small, and therefore they produce only a small error
in the tails of the masking curve. In the vicinity of the top,
however, the effect of this simplification may lead to a
significant underestimation of the predicted threshold shift (Duifhuis,
1973).



4. Some mathematical details 



With a linear halfwave rectifier and a linear low-pass filter (leaky
integrator), the system of Fig. 1 behaves more or less like a linear envelope
detector. Let probe and masker be given by

\[ P:\; P(t) = A_P\, p(t)\, \sin 2\pi f_P t , \qquad M:\; M(t) = A_M\, \delta(t) . \qquad (3) \]

The envelope of the impulse response of the band-pass filter h(t) is approx.
(Goldstein, pers. comm.) proportional to

\[ h_i(t) \propto \begin{cases} (a_i f_i t)^{\,n_i - 1}\, e^{-a_i f_i t} , & t \ge 0 \\ 0 , & t < 0 \end{cases} \qquad (4) \]

with n_i and a_i fixed by the slope sum S_i (and S_i = 120/f_i).



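If at t_ij the j-th neural event occurs in fiber i, then the output of the
integrator, L(t), is given by

\[ L(t) = \sum_{i,j} \exp\{-(t - t_{ij})/\tau\} , \qquad t_{ij} \le t . \qquad (5) \]

Given the Poisson processes with rate functions r_i(θ), Campbell's theorem
leads to

\[ E\{L(t)\} = \sum_i \int_{-\infty}^{t} r_i(\theta)\, e^{-(t-\theta)/\tau}\, d\theta \quad \text{and} \quad \sigma^2\{L(t)\} = \sum_i \int_{-\infty}^{t} r_i(\theta)\, e^{-2(t-\theta)/\tau}\, d\theta . \qquad (6) \]

Substituting h_i(θ) for r_i(θ) one obtains

\[ E\{L(t \mid M)\} = A_M\, N\, I_1(t) \quad \text{and} \quad \sigma^2\{L(t \mid M)\} = A_M\, N\, I_2(t) , \qquad (7) \]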







where N represents the summation across N effective fibers and I_1(t) and
I_2(t) are incomplete gamma-function integrals. In case of no stimulus we
have r_i(θ) = ρ = constant, and

\[ E\{L(t \mid \mathrm{SPONT})\} = N \rho \tau , \qquad \sigma^2\{L(t \mid \mathrm{SPONT})\} = N \rho\, \tfrac{1}{2} \tau . \qquad (8) \]
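
The spontaneous-activity case of Eq. 8 can be verified numerically with a small sketch of Campbell's theorem. It assumes a leaky integrator with impulse response exp(-u/τ) and approximates the integrals over the past by a midpoint sum; the function name, the step size and the finite history length are choices made here for illustration.

```python
import math

def campbell_moments(rate, t, tau, n_fibers=1, dt=1e-5, history=None):
    """Mean and variance of the leaky-integrator output for Poisson rate function rate(time)."""
    if history is None:
        history = 20.0 * tau               # older spikes contribute less than exp(-20)
    steps = int(round(history / dt))
    mean = 0.0
    var = 0.0
    for k in range(steps):
        u = (k + 0.5) * dt                 # time elapsed before t (midpoint rule)
        weight = math.exp(-u / tau)        # leaky-integrator weighting of a spike at t - u
        r = rate(t - u)
        mean += r * weight * dt            # first moment: integral of r * w
        var += r * weight * weight * dt    # second moment: integral of r * w**2
    return n_fibers * mean, n_fibers * var

rho = 25.0                                 # spontaneous rate in spikes/s, as in the text
tau = 0.002                                # time constant of 2 ms, as in the text
spont_mean, spont_var = campbell_moments(lambda time: rho, t=0.0, tau=tau)
```

For a constant rate the sketch returns values close to N ρ τ and N ρ τ/2. A time-varying rate function can be passed in the same way.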



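Similarly one obtains for the probe

\[ E\{L(t \mid P)\} = A_P\, N\, p(t) * I_1(t) = A_P\, N\, I_3(t) \quad \text{and} \quad \sigma^2\{L(t \mid P)\} = A_P\, N\, p(t) * I_2(t) = A_P\, N\, I_4(t) . \qquad (9) \]

Because of 'linearity', Q now becomes

\[ Q = \frac{\{A_P\, N\, I_3(t)\}^2}{\tfrac{1}{2} A_P\, N\, I_4(t) + N \rho\, \tfrac{1}{2} \tau + A_M\, N\, I_2(t)} . \qquad (10) \]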



For a given A_M the Q=1 criterion thus gives A_Pth as a function of t,
provided that N, ρ, and τ are known. For ρ, the average spontaneous activity,
we used 25 spikes/s; τ appeared to give a best fit at about τ = 2 ms. The
number N could be as low as 1; increase to N = 10 would decrease the masking
at the top of the curve by some 3.5 dB. Neither parameter value is very
critical. For a more detailed discussion the reader is referred to Duifhuis
(1973).
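
How the Q = 1 criterion yields a probe threshold can be sketched as follows. Setting Eq. 10 equal to 1 gives a quadratic equation in A_P, of which the positive root is the threshold. The values passed for I_2, I_3 and I_4 below are arbitrary placeholders (the actual integrals depend on t and on the filter envelope), the sketch evaluates a single instant t rather than the maximum of Q over t, and expressing the shift as 20 log10 of an amplitude ratio is an assumption made here.

```python
import math

def q_index(a_p, a_m, i2, i3, i4, n=1, rho=25.0, tau=0.002):
    """Q of Eq. 10 at one instant t, for probe amplitude a_p and masker amplitude a_m."""
    numerator = (a_p * n * i3) ** 2
    denominator = 0.5 * a_p * n * i4 + n * rho * 0.5 * tau + a_m * n * i2
    return numerator / denominator

def probe_threshold(a_m, i2, i3, i4, n=1, rho=25.0, tau=0.002):
    """Probe amplitude with Q = 1: positive root of a*x**2 - b*x - c = 0."""
    a = (n * i3) ** 2
    b = 0.5 * n * i4
    c = n * rho * 0.5 * tau + a_m * n * i2
    return (b + math.sqrt(b * b + 4.0 * a * c)) / (2.0 * a)

def threshold_shift_db(a_m, i2, i3, i4, **kwargs):
    """Masked threshold relative to the threshold in quiet (a_m = 0), in dB."""
    quiet = probe_threshold(0.0, i2, i3, i4, **kwargs)
    masked = probe_threshold(a_m, i2, i3, i4, **kwargs)
    return 20.0 * math.log10(masked / quiet)

# Placeholder integral values, chosen only to exercise the functions.
quiet_threshold = probe_threshold(0.0, i2=0.3, i3=0.5, i4=0.2)
shift = threshold_shift_db(10.0, i2=0.3, i3=0.5, i4=0.2)
```

With a_m = 0 the threshold is set by the spontaneous activity and by the variance of the probe response itself. A non-zero masker term in the denominator raises the root, which is the threshold increment discussed in Sec. 2.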



5. Physiological evidence

In one specific experiment set up at the Eaton-Peabody labo- 
ratory we looked for a possible correlate of auditory backward 
masking in single auditory-nerve fiber responses. The theory 
presented here predicts such a correlate, and moreover, being 
largely a physiological theory, it requires physiological vali- 
dation. The fragmentary results to be shown here cannot be con- 
sidered to give this validation, but they may be considered 
some supporting evidence. 

Recordings were made from single auditory-nerve fibers in 
one cat. Probe and masker stimuli were tone-bursts at the unit's
CF. The envelopes of the tone-bursts had a smooth 2.5 ms rise/ 
fall time and a duration of 10 ms (measured between half-ampli- 
tude points). Probe and masker carrier were in phase. 

Figure 3 shows an example of the results for probe - masker
intervals Δt = 2 ms and Δt = 5 ms, and for different probe levels,
in the form of PST histograms. For comparison also the responses
to probe alone and masker alone (at Δt = 2 ms) are given.
Comparing the response to probe + masker with masker alone at Δt = 2 ms,
no significant response to the probe is observed for probe level
-85 dB. The same probe level does elicit a response in absence
of the masker. Thus, in the presence of the masker the probe is
masked. As in psycho-acoustical backward masking, increases in
probe level, as well as increases in probe - masker interval,
have the effect that the probe response becomes more pronounced.
This was true for all studied units.

Fig. 3. PST histograms of a unit in response to a backward masking
stimulus for different probe levels and probe - masker intervals
(Δt = 2 and 5 ms). Probe and masker are 10 ms tone-bursts at
CF = 0.6 kHz. Also shown are the responses to probe alone (right
column), and to masker alone (bottom), the latter at Δt = 2 ms. Probe
onset is the same in all cases, the masker onset shifts with Δt. The
stimulus configuration is indicated schematically in the lower part of
the figure. Spontaneous activity: 30.5 /s. Fiber threshold: -90 dB.
Masker level: -55 dB if present, probe level as indicated. Reference
level: 200 V peak-to-peak into condenser earphones. PST histograms are
determined relative to the 10 per second stimulus marker. Length of
run: 30 s.



The above effect seems to indicate a latency effect, but note
that the decrease in average response latency with increasing
level is not the result of a shift of the response (probe alone
responses in Fig. 3). It is the result of the appearance of
earlier peaks in the response, whereas the positions of the
other peaks are relatively fixed. The same observations are
found in Kiang et al. (1965). Of particular interest are the PST
histograms in response to clicks as a function of click level
(e.g. their Fig. 5.7, p. 40/41). This is of relevance to
backward masking in that a loud masker evokes a neural discharge
earlier than a weak probe. Therefore, when being presented
together, the neural response to the earlier probe may not be
elicited before the response to the masker. Because the masker
is strong compared to the probe, the total response will be
determined by the masker and the response to the probe will not
be detectable.

The latency effect mentioned above can be interpreted in 
terms of peripheral frequency selectivity. The tuning curve 
selectivity and its linear-like relation to response delay 
(Goldstein et al., 1971) predict smooth onset and offset transients
of a click response. The gradual onset implies that at a higher 
click level the firing threshold is exceeded earlier. At the 
same time, effects like refractoriness and adaptation relative- 
ly reduce the response to subsequent peaks (Gray, 1967) . There- 
fore, although the stimulating waveform (i.e. the oscillatory 
click response) may be invariant, the resulting PST histograms 
will show the occurrence of earlier peaks with increase in 
level (Duifhuis, 1972a, Fig. 6.20). The fact that individual peaks
are not shifted, however, does not readily allow the concurrent 
interpretation in terms of a "response propagation time" that 
decreases with increase in level. These considerations suggest 
a cautious interpretation of response latency. 
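
The argument above can be illustrated with a minimal sketch, assuming a
fixed oscillatory click response and a fixed firing threshold. The
waveform shape, the time constant and the threshold value are
hypothetical choices made only for illustration, and refractoriness and
adaptation are deliberately left out. The sketch shows that raising the
level lets earlier peaks of an unchanged waveform cross the threshold,
so that the first response peak occurs earlier while every peak keeps
its position.

```python
import math

def click_response(t_ms, cf_khz=0.6, tau_ms=2.0):
    """Hypothetical click response: smooth gamma-like envelope times a sinusoid at CF."""
    envelope = (t_ms / tau_ms) ** 2 * math.exp(-t_ms / tau_ms)
    return envelope * math.sin(2.0 * math.pi * cf_khz * t_ms)

def positive_peaks(dt_ms=0.01, t_max_ms=20.0):
    """(time, amplitude) of every positive local maximum of the unit-level waveform."""
    n = int(t_max_ms / dt_ms)
    x = [click_response(i * dt_ms) for i in range(n)]
    return [(i * dt_ms, x[i]) for i in range(1, n - 1)
            if x[i] > 0 and x[i] > x[i - 1] and x[i] >= x[i + 1]]

def first_peak_above(level_db, threshold=1.0):
    """Time of the earliest waveform peak that exceeds the (assumed) firing threshold."""
    gain = 10.0 ** (level_db / 20.0)
    for t, amplitude in positive_peaks():
        if gain * amplitude >= threshold:
            return t
    return None  # no peak reaches threshold at this level

if __name__ == "__main__":
    for level in (0, 6, 10, 30):
        print(level, "dB ->", first_peak_above(level))
```

In this toy model the first supra-threshold peak jumps from one fixed
peak position to an earlier one as the level rises; no individual peak
is shifted in time.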

It would be premature to base conclusions on the outcome of 
a single experiment. However, the presented data being consis- 
tent with the Kiang et al. (1965) observations referred to above, 
and fitting in the general scheme of the theory, may be con- 
sidered a tentative piece of evidence that backward masking has 
a peripheral origin. 




6. Conclusion

A crude quantitative theory on backward masking was presented 
in this paper. The theory attributes a significant part of the 
origin of backward masking to the auditory periphery (particu- 
larly peripheral filtering) . The theory enables an order of 
magnitude interpretation of psychoacoustical data in terms of 
a description of neurophysiological observations (as boiled 
down in a model of the peripheral auditory system) . Tentative 
results of a single direct neurophysiological experiment are 
consistent with this interpretation, but do not readily allow 
the usual interpretation in terms of a response propagation time. 

Acknowledgements

Comments on earlier versions of this paper from B.L. Cardozo
and C.A.A.J. Greebe, and discussions with J.L. Goldstein on the
theoretical part were very helpful.

The author thanks N.Y.S.Kiang for the opportunity to do the 
described physiological experiment at the Eaton-Peabody labora- 
tory. He is indebted to D. Johnson for his major role in the 
execution of the experiment. Discussions with and comments from 
both and from W.M.Siebert provided valuable contributions to 
the latter part of this study. 

This study received support through a stipend from the Nether- 
lands Organization for the Advancement of Pure Research (Z.W.O.) 
and from N.I.H. grants 5 P01 GM 14940-07 and 5 T01 GM 01555-07.

Footnote 

Significant parts of this study have been developed during my visit to 
the M.I.T. Research Laboratory of Electronics (Aug. '72-Aug. '73).

References 

Duifhuis, H. (1972a). Perceptual Analysis of Sound. Doctoral dissertation,
Eindhoven University of Technology.

Duifhuis, H. (1972b). "Peripheral Aspects of Nonsimultaneous Masking",
Symposium on Hearing Theory 1972, held at the Institute for Perception
Research, Eindhoven, June 1972.

Duifhuis, H. (1973). "Consequences of peripheral frequency selectivity for
nonsimultaneous masking", J. Acoust. Soc. Amer. 54, in press.

Evans, E.F. (1972). "The frequency response and other properties of single
fibres in the guinea-pig cochlear nerve", J. Physiol. 226, 263-287.

Goldstein, J.L., Baer, T., and Kiang, N.Y.S. (1971). "A theoretical treatment
of latency, group delay and tuning characteristics for auditory nerve
responses to clicks and tones", in: Physiology of the auditory system,
M.B. Sachs, Ed. (National Educational Consultants, Baltimore MD).

Gray, P.R. (1967). "Conditional probability analysis of the spike activity of
single neurons", Biophys. J. 7, 759-777.

Green, D.M., and Swets, J.A. (1966). Signal detection theory and
psychophysics (Wiley, New York).

Kiang, N.Y.S., Watanabe, T., Thomas, E.C., and Clark, L.S. (1965).
Discharge patterns of single fibers in the cat's auditory nerve
(M.I.T. Press, Cambridge MA).

Siebert, W.M. (1965). "Some implications of the stochastic behavior of
primary auditory neurons", Kybernetik 2, 206-215.

Siebert, W.M. (1968). "Stimulus transformations in the peripheral auditory
system", in: Recognizing patterns, P.A. Kolers and M. Eden, Eds (M.I.T.
Press, Cambridge MA).

Siebert, W.M. (1972). "What limits auditory performance?", Proc. 4th Confer-
ence of the International Union of Pure and Applied Biophysics, Moscow.

Van Trees, H.L. (1968). Detection, Estimation, and Modulation Theory, Pt. I
(Wiley, New York).



ADDITIONAL REMARKS 

EVANS: (1) How do you account for the difference between your conclusion
and that of Watanabe and Simada (1971)? (2) Is your experiment of Fig. 3
conclusive? The weaker response appears to be already saturated. It
would have been better to have worked below saturation.

DUIFHUIS: (1) See my 1973 J.A.S.A. paper. (2) It is indicative. We
worked at a masker level of 40 dB above the unit's threshold (from TC),
which was intended to be below saturation level but still giving
significant masking, but apparently we had saturation.




V. Nonlinear Effects 







NONLINEAR EFFECTS IN THE TRANSIENT RESPONSE OF THE BASILAR MEMBRANE. 

L. ROBLES AND W.S. RHODE 

Department of Neurophysiology, University of Wisconsin, Madison, 
Wisconsin, USA. 

INTRODUCTION 

Three new techniques to measure sub-microscopic mechanical
vibrations have been applied to the cochlea in recent years. They are
the Mossbauer technique(1) (Johnstone and Boyle, 1967; Rhode, 1971),
the capacitive probe (Wilson and Johnstone, 1972) and the technique of
fuzziness-detection under laser illumination (Kohlloffel, 1972a). These
techniques, much more sensitive than the optical method used by von
Bekesy (1960), have made it possible to measure the oscillations of the
basilar membrane in live animals at physiological levels of stimulation.

Even though the three techniques have been used in preparations
which were quite different, the results obtained show basic points of
agreement (Johnstone et al., 1970; Rhode, 1971; Wilson and Johnstone,
1972; Kohlloffel, 1972b). The results for the three methods confirmed
most of von Bekesy's observations about the pattern of vibration of the
cochlear partition, but the "resonance curve" for the motion of the
basilar membrane was considerably sharper than that obtained by von
Bekesy (1960).

Among the discrepancies in the results obtained by the different 
groups of investigators, the most important because of its physiological 
implications is the nonlinearity in the motion of the basilar membrane 
reported by Rhode (1971). While the other groups report linear behavior 
of the basilar membrane, Rhode, using the Mossbauer technique, describes a
nonlinearity limited to frequencies close to the "characteristic 
frequency" for the region being measured. 

The present work reports on the results obtained in measure- 
ments of the transient response of the basilar membrane of the squirrel 

(1) The Mossbauer technique is a method that can be used to measure
very small velocities (0.2 to 2 mm/s) utilizing the Doppler shift of
gamma radiation (Frauenfelder, 1962).





monkey using the Mossbauer technique. The primary aspects of the
transient response that are analyzed in this report are the level of
damping and the linearity of the response.
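
To give a feeling for the velocity range quoted in footnote (1), the
following sketch converts a source velocity into the first-order
Doppler shift of the emitted gamma radiation, ΔE/E = v/c. The gamma
energy of 14.4 keV is an assumption made here for illustration (it is
not stated in the text), so the absolute shifts are indicative only.

```python
C_M_PER_S = 299_792_458.0  # speed of light in m/s

def fractional_doppler_shift(v_mm_per_s):
    """First-order Doppler shift dE/E = v/c for a source velocity in mm/s."""
    return (v_mm_per_s * 1e-3) / C_M_PER_S

def energy_shift_nev(v_mm_per_s, gamma_kev=14.4):
    """Energy shift in nano-electronvolts; gamma_kev is an assumed line energy."""
    return fractional_doppler_shift(v_mm_per_s) * gamma_kev * 1e3 * 1e9

if __name__ == "__main__":
    for v in (0.2, 1.0, 2.0):  # velocity range quoted in the footnote
        print(v, "mm/s ->", fractional_doppler_shift(v), energy_shift_nev(v), "neV")
```

Under these assumptions a velocity of 1 mm/s corresponds to a
fractional shift of a few parts in 10^12, which indicates why a very
narrow gamma line is needed to resolve such small velocities.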

METHODS 



The results discussed in this paper were obtained in a series
of experiments performed in 33 squirrel monkeys. The surgical approach
to the cochlea and the equipment to register the gamma radiation were
the same as used by Rhode (1971). The animals were anesthetized with
sodium pentobarbital. The eighth cranial nerve was exposed using a
dorsal approach and, after cutting one of the branches of the vestibular
nerve, a small opening (less than 0.5 mm in diameter) was drilled into
the scala tympani. A Mossbauer source 5 µm thick and about 60 µm x 60
µm was placed on the basilar membrane with the aid of a glass pipette.
After the placement of the source, the scala tympani was refilled with
physiological saline and the opening was covered with a small piece of
thin plastic.

The stimuli used were acoustic condensation clicks, about 150 µs
in duration, generated by a 1/2 inch condenser microphone and delivered
in front of the eardrum by means of a closed sound coupler. The stimuli
were monitored by a 1/8 inch condenser microphone located inside the
coupler about 1/4 inch from the eardrum. Removal of the monitor
microphone and sealing of the sound coupler by a small plastic window
allowed measurement of the response of the malleus. The stimuli were
presented in sequences of 100,000 to 400,000 clicks for each stimulus
intensity, while the instantaneous velocity of the radioactive source
was recorded by a LINC computer in the form of a post-stimulus time
histogram (PST) of the gamma radiation.

RESULTS 



The click response of the basilar membrane measured with the 
Mossbauer technique at the only location along the membrane that could 
be reached due to surgical restraints was always a damped oscillation 
with a natural frequency of about 7 kHz.







There were large differences in the shape and duration of the 
responses recorded in different animals. Figure 1 displays the 
transient response for three animals, illustrating quite clearly the 
large differences observed in the responses. Each one of the responses 
shown in the figure is a PST histogram of the gamma radiation collected 
for 100,000 or 200,000 stimulus repetitions. The oscillations of the 
membrane appear in the histograms as rectified oscillations of the number 
of gamma rays counted per bin. 





Fig. 1: PST histograms of gamma radiation for click responses in 

different animals. (a) Acoustic click used as stimulus, 
measured with the 1/8 inch condenser microphone in the sound 
coupler, (b) through (d) click responses obtained in three 
different animals. Due to the symmetry of the Mossbauer 
characteristic, the oscillations of the membrane appear 
rectified in the histograms. The lower level of counts 
corresponds to zero velocity. Note the clipping in the initial 
oscillations, especially in histograms b and d. 



The initial cycles of all the responses shown in Fig. 1 appear
clipped due to the limited dynamic range of the Mossbauer technique.
This clipping of the larger velocities in the PST makes it more
difficult to estimate the damping of the responses; however, the very
slow decrease in amplitude of the oscillations in the tail of some of
the responses (as in parts b and d in Fig. 1) shows that at least for
some animals the recorded responses were very lightly damped. It must
be pointed out that even the more highly damped click responses
obtained in these experiments had a considerably lower logarithmic
decrement than the value of 1.4 to 1.8 previously estimated for the
cochlear partition (von Bekesy, 1960).
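
As a rough guide to what these decrement values imply, the sketch below
assumes the usual definition of the logarithmic decrement, the natural
logarithm of the ratio of successive peak amplitudes one period apart,
and estimates it from a list of peak amplitudes. The function names and
the example amplitudes are hypothetical; in a rectified histogram the
visible peaks are half a period apart, so every second peak would have
to be used.

```python
import math

def log_decrement(peaks):
    """Mean log ratio of successive peak amplitudes taken one period apart."""
    if len(peaks) < 2 or any(p <= 0 for p in peaks):
        raise ValueError("need at least two positive peak amplitudes")
    return math.log(peaks[0] / peaks[-1]) / (len(peaks) - 1)

def cycles_to_fraction(delta, fraction=0.1):
    """Number of cycles after which the envelope has fallen to `fraction`."""
    return math.log(1.0 / fraction) / delta

if __name__ == "__main__":
    example_peaks = [math.exp(-0.2 * n) for n in range(6)]  # synthetic data
    print(log_decrement(example_peaks))
    print(cycles_to_fraction(1.4), cycles_to_fraction(1.8))
```

With a decrement of 1.4 to 1.8 the envelope would fall to one tenth
within fewer than two cycles, whereas the responses described here
continue to oscillate for many cycles.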

One of the most consistent effects observed in the transient
response was a pronounced nonlinear behavior for changes in click
intensity. The nonlinearity was characterized by a less than
proportional decrease in amplitude of the later part of the damped
oscillation for a decrease in intensity of the stimulus. The two or three
earliest oscillations of the response always seemed to behave in a
linear fashion. A clear nonlinear effect was observed in 80% of the
preparations, and it was found in every animal that had a response with
a long duration.

The different behavior of the first two or three cycles
compared with the later cycles of the response for a decrease in intensity
of the stimulus is illustrated in Figs. 2 and 3. Figure 2 shows PST
histograms of gamma radiation for click responses obtained in the same
animal at different values of stimulus attenuation. The responses
shown in Fig. 3 have been obtained by computing the instantaneous
velocity of the basilar membrane from the PST histograms and then
inverting alternate peaks of the resulting curve to remove the
rectification effect introduced by the Mossbauer characteristic. In both
figures the amplitudes of the early peaks of the response decrease
almost linearly, while the amplitudes of the later peaks diminish very
little for a large decrease in click intensity (see captions for Figs.
2 and 3).
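
The step of inverting alternate peaks can be sketched as follows. This
is only an illustration under simplifying assumptions, not the
procedure actually used: the input is taken to be a noise-free list of
absolute velocities, zero crossings are taken to be the local minima
that fall below a tolerance `zero_tol`, and both names are hypothetical.

```python
import math

def unrectify(abs_v, zero_tol):
    """Restore a signed trace by flipping the sign at each near-zero local minimum."""
    signed = []
    sign = 1.0
    n = len(abs_v)
    for i, a in enumerate(abs_v):
        is_local_min = 0 < i < n - 1 and a <= abs_v[i - 1] and a < abs_v[i + 1]
        if is_local_min and a <= zero_tol:
            sign = -sign  # a zero crossing: the next lobe has opposite polarity
        signed.append(sign * a)
    return signed

if __name__ == "__main__":
    dt = 0.005  # ms
    true_v = [math.exp(-i * dt / 0.5) * math.sin(2 * math.pi * 7.0 * i * dt)
              for i in range(200)]  # synthetic damped 7 kHz oscillation
    recovered = unrectify([abs(v) for v in true_v], zero_tol=0.15)
    print(recovered[:5])
```

The polarity of the first lobe is not determined by the rectified data
and is simply taken as positive here, which mirrors the ambiguity in
the direction of motion discussed in the text.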

Since the Mossbauer technique is velocity sensitive, the
velocity curves (such as the ones shown in Fig. 3) must be integrated
to obtain the displacement response of the basilar membrane.
Unfortunately, there are two factors that introduce some uncertainty in the




[Figure 2. Ordinate: counts; abscissa: time (ms).]



Fig. 2: PST histograms of gamma radiation obtained at various
intensities in the same animal. (a) Acoustic click used as stimulus;
(b) through (d) click responses obtained using different values
of click attenuation, as shown alongside the figure. Note the
drastic reduction in amplitude of the first three half cycles
of oscillation from histogram b to d. The first half cycle
(at T1) in histogram b has disappeared in c and d; the second
and third ones (at T2 and T3), which are severely clipped in b,
are strongly reduced in c and d. In contrast, the oscillations
between 2.0 and 3.0 ms have a small change in amplitude from
histogram b to d, for a 23 dB change in intensity.




[Figure 3. Velocity curves a to d for click attenuations (ATTN) of
0, 6, 12 and 18 dB; abscissa: time (ms), 0 to 4.]
Fig. 3: Velocity curves for click responses at various intensities
in the same animal. The velocity curves were obtained from
the PST histograms of gamma radiation by computing the
corresponding absolute velocities from the Mossbauer
characteristic and by inverting alternate peaks of the absolute
velocity curves at the zero crossings. The velocity in the
responses has been clipped at a value of ±1.2 mm/s. Note
that the first half cycle of oscillation (at T1) in curve a
reaches the level of clipping and that the two following ones
(at T2 and T3) are severely clipped. The corresponding half
cycles in curve d have vanished or are strongly reduced in
amplitude. All of the later oscillations have a much smaller
change in amplitude from curve a to d in spite of the 18 dB
change in stimulus intensity. The click used had a peak SPL
of 106 dB at 0 dB attenuation.



results obtained from this integration: 1) the clipping of the first
cycles of the response that reach higher velocities and 2) the ambiguity
with regard to the direction of the basilar membrane displacement
which is introduced by the symmetric Mossbauer characteristic of the
radioactive source used.

The displacement responses shown in Fig. 4 have been obtained
by integrating velocity curves c and d in Fig. 3. In order to avoid the
error introduced by the clipping effect, displacement curves have been
computed only for low intensity clicks, in which the velocity curves are
almost free of clipping. The responses shown in the figure, especially
the one in part a, show a lightly damped oscillation superimposed on a
slower transient response that, for most of the experiments analyzed,
had a net displacement in one direction. The resting position of the




Fig. 4: Displacement responses of the basilar membrane to click
stimulus at two intensities. The two displacement curves have been
obtained by integrating the velocity curves c and d in Fig. 3. As in
the previous figure, 0 dB attenuation corresponds to a click with a
peak SPL of 106 dB.



basilar membrane, indicated in the figures by a dashed line, has been
fitted by eye to the center of gravity of the tail of the response. It
seems to us that this procedure is reasonable, since after a short
disturbance one would expect the basilar membrane to return to its
normal position. The difference, in some responses, between the final
position and the initial resting position of the basilar membrane may be
due to errors in the integration caused by three factors: 1) a small
amount of clipping in the first cycles of the response; 2) small
oscillations at the beginning of the response with peak velocities
that fall under the velocity threshold of the technique and 3)
statistical error due to the random nature of the measuring technique.
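
The integration and baseline steps described above can be sketched as
follows, assuming uniformly sampled velocities in mm/s and time in ms
(so that the integral comes out in µm). The trapezoidal rule and the
use of the mean of the last quarter of the trace as a stand-in for the
baseline "fitted by eye" are choices made for this illustration only.

```python
import math

def integrate_velocity(v_mm_per_s, dt_ms):
    """Cumulative trapezoidal integral; mm/s times ms gives displacement in µm."""
    x = [0.0]
    for a, b in zip(v_mm_per_s, v_mm_per_s[1:]):
        x.append(x[-1] + 0.5 * (a + b) * dt_ms)
    return x

def resting_level(x, tail_fraction=0.25):
    """Mean of the tail of the trace, a stand-in for the baseline fitted by eye."""
    n_tail = max(1, int(len(x) * tail_fraction))
    tail = x[-n_tail:]
    return sum(tail) / len(tail)

def sine_displacement_nm(v_mm_per_s, f_khz):
    """Displacement amplitude (nm) of a sinusoid with the given velocity amplitude."""
    return 1000.0 * v_mm_per_s / (2.0 * math.pi * f_khz)

if __name__ == "__main__":
    print(integrate_velocity([2.0] * 11, 0.1)[-1])  # constant 2 mm/s for 1 ms
    print(sine_displacement_nm(1.0, 7.0))           # 1 mm/s at 7 kHz
```

Because the integral accumulates every sample, a small systematic error
in the velocity (for example from clipping or from velocities below the
threshold of the technique) appears as a drift of the final position,
which is the kind of offset discussed in the text.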

DISCUSSION 

The wide range of values of damping observed in the different 
animals makes it difficult to give an estimate for the normal logarith- 
mic decrement of the cochlear partition. However, the observation that 
even the more heavily damped click responses obtained in these experi- 
ments had a much lighter damping than that classically assigned to the 
cochlear partition, strongly supports the more recent observations in 
which the frequency response of the basilar membrane appears to be much 
sharper than previously believed (Johnstone and Taylor, 1970; Rhode,
1971; Wilson and Johnstone, 1972; Kohlloffel, 1972b).

Furthermore, since one of the most striking results obtained in 
these experiments is the low level of damping observed in some of the 
preparations, one might ask whether these very lightly damped 
responses may not be representative of the click response of the normal 
cochlea. The higher levels of damping observed in some of the prepara- 
tions could then be explained by differences in the physiological state 
of the cochlea, such as might result from impairment of the cochlear 
blood supply due to the drastic surgery required in the preparation. 

The progressive increase in damping of the response observed to some 
degree in all preparations, suggests that the damping of the response 
may be correlated with the physiological state of the cochlea. 

Our results obtained with transient stimuli are consistent
with the earlier observation that the basilar membrane behaves
nonlinearly with changes in stimulus intensity (Rhode, 1971).
Moreover, the fact that the first cycles of the transient responses,
which behave more linearly, are of a lower frequency than the later
ones, suggests a much closer agreement between both sets of data.
One would expect that a system having a nonlinearity restricted to
the frequencies close to the "characteristic frequency" in the steady
state response (as described by Rhode) would show a stronger
nonlinear effect at the tail of the transient response, which depends
more heavily on the frequency components around the peak of the
"characteristic frequency."

These conclusions about the agreement between the nonlinear 
characteristics observed in these experiments and the nonlinear effects 
reported by Rhode are supported by the predictions of Kim et al. (1973)
who used a model of basilar membrane motion that includes nonlinear 
damping. The impulse responses obtained with their model for a decrease 
in stimulus intensity show a marked decrease in amplitude in the first 
cycle of the response while there is almost no change in the amplitude 
of the late oscillations, a result very similar to our experimental 
observations. 
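
The qualitative effect of nonlinear damping on an impulse response can
be illustrated with a toy model. The sketch below is not the model of
Kim et al. (1973); it is a single oscillator, assumed here for
illustration, whose damping ratio grows with the square of the velocity.
All parameter values (zeta0, v_ref, the two kick velocities, the
analysis windows) are hypothetical. The impulse is applied as an initial
velocity, so the early window scales linearly by construction; the point
of the sketch is the much smaller change in the late oscillations.

```python
import math

def kicked_oscillator(v_kick, zeta0=0.02, v_ref=1.0, f_khz=7.0,
                      dt_ms=0.0005, t_max_ms=3.0):
    """Velocity trace of x'' + 2*zeta(v)*w*x' + w^2*x = 0 after a velocity kick.

    zeta(v) = zeta0 * (1 + (v / v_ref)**2) is an assumed nonlinear damping law.
    """
    w = 2.0 * math.pi * f_khz  # angular frequency in rad/ms

    def accel(x, v):
        zeta = zeta0 * (1.0 + (v / v_ref) ** 2)
        return -2.0 * zeta * w * v - w * w * x

    x, v, h = 0.0, v_kick, dt_ms
    trace = []
    for _ in range(int(round(t_max_ms / dt_ms))):
        trace.append(v)
        # classical fourth-order Runge-Kutta step
        k1x, k1v = v, accel(x, v)
        k2x, k2v = v + 0.5 * h * k1v, accel(x + 0.5 * h * k1x, v + 0.5 * h * k1v)
        k3x, k3v = v + 0.5 * h * k2v, accel(x + 0.5 * h * k2x, v + 0.5 * h * k2v)
        k4x, k4v = v + h * k3v, accel(x + h * k3x, v + h * k3v)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return trace

def peak_in_window(trace, t0_ms, t1_ms, dt_ms=0.0005):
    """Largest absolute velocity between t0_ms and t1_ms."""
    i0, i1 = int(round(t0_ms / dt_ms)), int(round(t1_ms / dt_ms))
    return max(abs(v) for v in trace[i0:i1])

def change_db(a, b):
    return 20.0 * math.log10(a / b)

loud = kicked_oscillator(4.0)   # stronger click (arbitrary units)
soft = kicked_oscillator(0.5)   # about 18 dB weaker
early_db = change_db(peak_in_window(loud, 0.0, 0.15), peak_in_window(soft, 0.0, 0.15))
late_db = change_db(peak_in_window(loud, 2.0, 3.0), peak_in_window(soft, 2.0, 3.0))

if __name__ == "__main__":
    print("early change (dB):", early_db, "late change (dB):", late_db)
```

In this toy model the late part of the response changes by clearly
fewer decibels than the stimulus, because the larger oscillation is
damped more strongly until both responses reach similar amplitudes.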

It is interesting to note that in recordings of click responses 
from single fibers of the auditory nerve, Pfeiffer and Kim (1972) 
reported little change in the number of peaks of the response for wide 
changes in the intensity of the stimulus. This effect observed in 
their Population I fibers is precisely the one we have described in our 
results as the nonlinearity in the late part of the click response (see
Figs. 2 and 3). This similarity, as well as similarities in the shape
of the responses, seems to indicate that at least some of the nonlinear
effects observed in the eighth nerve responses may be produced by
nonlinearities in the vibration of the basilar membrane.

In two recent papers evidence is presented suggesting that
two-tone inhibition, another well studied nonlinear effect, may be produced
by mechanical events in the cochlear partition. In their recordings
from phase-sensitive neurons of the anteroventral cochlear nucleus of
the cat, Rose et al. (1974) report nonlinear interactions of two low
frequency tones. They match the peaks of the period histogram of the
neural discharges with a complex waveform generated as a nonlinear
combination of the two stimulating tones according to a stated set of
rules. The fact that good matching can be obtained over a wide range
of stimulus conditions is interpreted as evidence that the complex
waveform may actually reflect the mechanical stimulating waveform and
that the nonlinear process may thus be mechanical. In their report
Legouix et al. (1973) describe results that suggest that the two-tone
inhibition phenomenon could be caused by an asymmetrical vibratory
movement of the basilar membrane.

The results we have obtained for the displacement of the
basilar membrane in response to a click stimulus, even though they are
tentative because of the described uncertainties in the integration,
seem to show an average transient displacement in one direction that
we believe is toward scala vestibuli. Such an asymmetry in the
displacement could have important implications with regard to the mechanism
that produces the reported nonlinearity. For this reason it would be
most valuable to know the click response to a rarefaction click. Due
to technical problems only a few responses using rarefaction clicks
were recorded in this series of experiments, and there is not enough
data to draw any conclusions yet. It is worth noting that Stopp (1969),
recording electric potentials from the cochlea of pigeons, observed
transient responses at the onset and termination of a tonal stimulus
that in both cases indicated an average displacement of the cochlear
partition toward scala vestibuli.

From the preceding discussion we see that there is experimental 
evidence obtained from quite different sources (cochlear microphonic 
potentials, responses of single cochlear nerve fibers and recordings 
from neurons of the cochlear nucleus) that seems to indicate a mechanical 
origin for observed nonlinearities. As noted by one of us (Rhode, 1973), 
the cochlear partition is a complex structure and there may be other 
sources of nonlinearity besides the motion of the basilar membrane. 
Nevertheless, the evidence reviewed here suggests that the mechanics of 
the basilar membrane could explain some of the nonlinearities. 

In any case the close agreement between the nonlinear effect
obtained in the time domain and the nonlinearity observed by Rhode in
the frequency domain, together with the results obtained in post-mortem
investigations (Rhode, 1973), seems to indicate, as we have already pointed
out (Rhode and Robles, 1974), that the Mossbauer technique does in fact
properly measure the motion of the basilar membrane.





ACKNOWLEDGMENTS 

This work contains part of the material presented in a thesis 
submitted by L. Robles in partial fulfillment of the PhD degree at 
the University of Wisconsin. We thank Prof. C.D. Geisler who acted as 
L. Robles' thesis advisor. We also thank Prof. J.E. Hind for his 
comments and suggestions on this manuscript. 

This investigation was supported by Program Project Grant 
NS-06225 from the National Institutes of Health. 



REFERENCES 

von Bekesy, G. (1960). Experiments in Hearing, edited by E.G. Wever
(McGraw-Hill, New York).

Frauenfelder, H. (1962). The Mossbauer Effect (W.A. Benjamin, Inc.,
New York).

Johnstone, B.M., and Boyle, A.J.F. (1967). "Basilar Membrane
Vibration Examined with the Mossbauer Technique," Science
158, 389-390.

Johnstone, B.M., and Taylor, K. (1970). "Mechanical Aspects of
Cochlear Function" in Frequency Analysis and Periodicity
Detection in Hearing, R. Plomp and G.F. Smoorenburg, Eds.
(Sijthoff, Leiden, The Netherlands), pp. 81-90.

Johnstone, B.M., Taylor, K.J., and Boyle, A.J. (1970). "Mechanics
of the Guinea Pig Cochlea," J. Acoust. Soc. Amer. 47, 504-509.

Kim, D.O., Molnar, C.E., and Pfeiffer, R.R. (1973). "A System of
Nonlinear Differential Equations Modeling Basilar-Membrane
Motion," J. Acoust. Soc. Amer. 54, 1517-1529.

Kohlloffel, L.U.E. (1972a). "A Study of Basilar Membrane Vibrations
I. Fuzziness-Detection: A new Method for the Analysis of
Microvibrations with Laser Light," Acustica 49-65.

Kohlloffel, L.U.E. (1972b). "A Study of Basilar Membrane Vibrations
III. The Basilar Membrane Frequency Response Curve in the
Living Guinea Pig," Acustica 27, 82-89.

Legouix, J.P., Remond, M.C., and Greenbaum, H.B. (1973). "Interference
and Two-Tone Inhibition," J. Acoust. Soc. Amer. 409-419.

Pfeiffer, R.R., and Kim, D.O. (1972). "Response Patterns of Single
Cochlear Nerve Fibers to Click Stimuli: Descriptions for Cat,"
J. Acoust. Soc. Amer. 52, 1669-1677.

Rhode, W.S. (1971). "Observations of the Vibration of the Basilar
Membrane in Squirrel Monkeys Using the Mossbauer Technique,"
J. Acoust. Soc. Amer. 1218-1231.

Rhode, W.S. (1973). "An Investigation of Post-Mortem Cochlear
Mechanics Using the Mossbauer Effect," in Basic Mechanisms
in Hearing, A.R. Møller, Ed. (Academic Press, New York),
pp. 49-63.

Rhode, W.S., and Robles, L. (1974). "Evidence from Mossbauer
Experiments for Nonlinear Vibration in the Cochlea", J. Acoust. Soc.
Amer. (in press), (presented at the George von Bekesy Memorial
Symposium, Boston, Massachusetts, April 1973).

Robles, L. (1973). "Measurements on the Transient Response of the
Basilar Membrane Using the Mossbauer Effect," Ph.D. Thesis,
Univ. of Wisconsin.

Rose, J.E., Kitzes, L.M., Gibson, M.M., and Hind, J.E. (1974).
"Observations on Phase-Sensitive Neurons of Anteroventral
Cochlear Nucleus of the Cat: Nonlinearity of Cochlear Output,"
J. Neurophysiol. 37, 218-253.

Stopp, P.E. (1969). "The Transient Electric Responses of the Cochlea,"
J. Physiol. 353-365.



ADDITIONAL REMARKS 

TONNDORF: Your Fig. 4 showed a dc shift, presumably towards sc. tympani.
Except for the sign, this is identical to the shift I have regularly 
observed in models in which it is independent of structural properties. 

It simply depends on the point of entrance. J. Hind had seen similar 
dc shifts in Perlman’s guinea pig observations (1951). I believe that 
this is, at least partly, what brings the summating potential about. 

ROBLES: As we have pointed out in the paper, because of the symmetric 
characteristic of the radioactive source used in our experiments, we 
are not absolutely certain about the direction of the movements of the 
basilar membrane. However, assuming that the first movement for the high 
intensity condensation clicks is toward scala tympani, we concluded that
the transient displacement shown in Fig. 4 must be toward scala 
vestibuli. We hope that a new series of experiments we are preparing now, 
in which we will use a radioactive source with isomer shift will give us 
a definite answer to this question. 







NONLINEAR MECHANISMS AND COCHLEAR SELECTIVITY
J.P. LEGOUIX and M.C. REMOND

Laboratoire de Neurophysiologie, College de France, Paris



Some recent works have provided evidence for various
sources of nonlinearities in the ear. They can explain
not only the occurrence of harmonics and combination tones
observed on CM, but also the mechanism of summating
potential and of two-tone interference.

The identification of the structures which are
responsible for these phenomena is difficult. It can be
assumed that the middle ear vibrations, as well as the
hydromechanical processes inside the cochlea, are possible
sources of nonlinear vibrations at some intensity level
(Eldredge and Miller, 1971). However, the electrical
processes which accompany the production of CM might also
explain the origin of the nonlinear effects observed on
CM. These various hypotheses have been tested on
electrical models (Engebretson and Eldredge, 1968) and also on
the vibration of the cochlear partition (Rhode, 1971;
Johnstone and Boyle, 1967) and on discharges of the
auditory nerve fibers (Evans, 1972).

Several experimental results seem to indicate
that the main source of nonlinearity is in the cochlea. But
it is still a matter of debate whether the nonlinearity
displayed by CM reflects nonlinear mechanical
vibrations or a nonlinear electrical process (De Boer and
Six, 1960; Durrant and Dallos, 1972).

Because such mechanisms may have various
implications for cochlear functioning, we attempted, in
a series of experiments, to determine the respective roles
of electrical and mechanical nonlinearities in the
interference phenomenon.

To that end, the modification of interference
was studied during short spells of hypoxia. As is well
known, CM is very sensitive to the lack of O2 and rapidly
decreases in amplitude, while its waveform is altered
and the distortion products are increased (Legouix and
Chocholle, 1957). In the same way, SP displays classical
changes of amplitude and polarity. Since it is not likely
that mechanical properties of the vibrating structures
are modified during hypoxia, these various phenomena
suggest that some nonlinearities are related to the
electrical processes generating CM.

Technique 

The cochlear microphonics were recorded in the
guinea pig by differential electrodes located in various
turns, according to the classical technique. Anesthesia
was obtained by intraperitoneal injection of
ethylurethane. The animal was curarized and artificially
ventilated. The stimuli were applied in free field, and
intensity was controlled with a sound probe introduced in the
ear canal and connected to a sound level meter (Bruel
and Kjaer).

Results 

1/ Modifications of interference during hypoxia 

The procedure to measure two-tone interference
was the following. A pure tone of fixed frequency was
used as a test tone, and the voltage of the microphonics
which it produced was measured with precision. A second
tone was added to the test tone to produce a decrease,
or a suppression, of the CM provoked by the test tone.
The amplitude of the CM response was studied with a
frequency analyser in order to avoid the difficulties of
reading a complex wave on the screen of an oscilloscope.
Interference was evaluated by calculating the percentage
decrease in amplitude of the CM response to the test
tone under the action of the suppressing tone.
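
The interference measure described above can be written out as a short calculation. The following is only a minimal sketch: the function name and the sample amplitudes are hypothetical and are not taken from the recordings; the formula is simply the percentage decrease of the test-tone CM when the suppressing tone is added.

```python
def interference_percent(cm_alone, cm_with_suppressor):
    """Percentage decrease of the test-tone CM caused by the suppressing tone.

    cm_alone           -- CM amplitude at the test-tone frequency, test tone alone
    cm_with_suppressor -- CM amplitude at the same frequency with the suppressor on
    Both values must be in the same (arbitrary) units.
    """
    if cm_alone <= 0:
        raise ValueError("cm_alone must be positive")
    return 100.0 * (cm_alone - cm_with_suppressor) / cm_alone


# Illustrative amplitudes only (arbitrary units), not measured data.
print(interference_percent(2.0, 1.5))  # 25.0
```

A value of 0 means the suppressing tone has no effect on the test-tone response, and 100 means the response is abolished.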

The variations of interference were measured
at various instants during the course of moderate hypoxia.
Hypoxia was provoked by decreasing the volume of air
furnished by the respiratory pump. Several parameters,
such as temperature and ECG, were recorded to verify the
physiological condition of the animal.

Hypoxia was maintained for short periods of 5
or 15 min in order to obtain good reversibility of the CM
decrease. Both sounds were presented for a few seconds in
order to avoid fatigue. The action of the suppressing tone
varied during the course of hypoxia. The
changes were similar in the 1st and 3rd turns of the
cochlea. Typical results were obtained in the first turn
with a test tone fixed at 1000 Hz, 70 dB SPL. The
frequency of the suppressing tone was fixed at 7000 Hz, but
the results differed according to its intensity.

a/ Intensity of the suppressing tone less than 70 dB SPL

Interference first increased slightly while
CM was decreasing rapidly. In some instances, at the
beginning of hypoxia, interference was modified when CM
had not yet decreased. Later it disappeared totally when
CM vanished (Fig. 1A). If hypoxia was maintained,
interference was no longer observable, at least with this
suppressing tone. When respiration was returned to normal,
the recovery of CM was complete and followed by an
overshoot, as is classical. Interference reappeared and
rose at the same time as CM.




Fig. 1. Variations of interference compared with variation
of CM amplitude during a mild hypoxia (first turn).
The value of interference is the percentage of decrease
of the CM response to the test tone when the suppressing
tone is added. CM amplitude is measured peak-to-peak in
arbitrary units. Test tone 1 kHz, 60 dB SPL.

A/ Weak suppressing tone: 7 kHz, 60 dB SPL. Interference,
after a slight increase, disappears when CM vanishes.

B/ Suppressing tone of relatively high intensity: 7 kHz,
80 dB SPL.



b/ Intensity of the suppressing tone above 70 dB SPL



The interference was stronger, but remained
practically unchanged during the whole period of hypoxia
until the complete disappearance of CM (Fig. 1B).
Slight variations, however, were sometimes observed at
the beginning of hypoxia. In some intermediate cases,
with suppressing tones slightly above 70 dB SPL,
interference decreased at first and remained constant
afterwards (Fig. 2).

Of course, if the experiment started with a
weak suppressor tone, when interference had disappeared
it was always possible to make it reappear by increasing
the intensity of the suppressor tone.

These results seem to indicate that, for weak
suppressing tones, interference is related to electrical
nonlinearities which can be modified when the metabolism
of the generator is altered. Only the first-order CM
seems to display such nonlinearities. Suppressing tones
of higher intensities would produce, in addition to the
electrical ones, other nonlinearities, probably of mechanical
origin, since they are unmodified during hypoxia. They
would dominate the electrical nonlinearities when the
suppressing tone is of high intensity.

II/ Interference and summating potential during hypoxia 

In order to determine the possible relations
which might exist between interference and summating
potential during hypoxia, interference was calculated as
described above. The amplitude of SP provoked by a
10 msec tone burst, made by gating the suppressing
tone, was measured.

A relatively good correlation between the
modifications of interference and changes of SP was found.
When the suppressing tone was weak, SP, which was negative
at the beginning of hypoxia, increased in amplitude when
interference increased. When interference decreased, SP
decreased also and disappeared at the very moment when
interference disappeared. When the suppressing tone was
at a higher intensity, interference was approximately
constant and SP remained present during the period of
hypoxia. Because SP generators, as well as CM generators,
are progressively depressed by hypoxia, the changes of
absolute amplitude of SP probably do not reflect exactly
the nonlinearities. To take account of this decrease, the
values of SP were corrected by a coefficient corresponding
to the decrease of CM. In these conditions, a better
correlation between the modifications of interference and
the changes of SP was expected to be found (Fig. 2).
However, the relation between interference and summating
potential was not very precise, in particular because SP
displayed changes of polarity which were difficult to
correlate with interference.
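
The correction of SP for the hypoxic decrease of CM is described only in words. One possible reading, given here as a sketch under stated assumptions, is that each SP value is scaled by the ratio of the pre-hypoxia CM amplitude to the CM amplitude at the same instant. The exact coefficient used in the experiments is not specified in the text, and the function name and the numbers below are hypothetical.

```python
def corrected_sp(sp, cm_current, cm_reference):
    """Scale an SP value to compensate for the hypoxic depression of CM.

    ASSUMPTION: the correction coefficient is cm_reference / cm_current,
    i.e. SP is scaled up by the same factor by which CM has decreased.
    The sign (polarity) of SP is preserved.
    """
    if cm_current <= 0 or cm_reference <= 0:
        raise ValueError("CM amplitudes must be positive")
    return sp * (cm_reference / cm_current)


# Illustrative values only: CM has fallen to half its initial amplitude,
# so a raw SP of -10 (arbitrary units) is corrected to -20.
print(corrected_sp(-10.0, cm_current=0.5, cm_reference=1.0))  # -20.0
```

Such a scaling cannot be applied once CM has vanished, which is consistent with the observation that the relation becomes hard to follow late in hypoxia.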





Fig. 2. Variations of interference and of SP during
hypoxia. The test tone was 1 kHz, 60 dB SPL. The suppressing
tone was 7 kHz, 75 dB SPL. At this intensity, interference,
after a slight increase, decreases and remains at an
approximately constant level. SP shows similar variations.
The dotted lines represent the value of SP after correction
to account for the CM decrease. A correlation between
interference and SP is noticeable, but it is complicated by
the change of polarity of SP. SP remains as long as
interference. Abscissa: time [min].

Discussion 



In spite of the fact that the nonlinear mechanisms
have attracted the attention of many authors, a number
of questions are still to be answered. The existence
of nonlinearities in the mechanical vibrations of the
basilar membrane is supported by some experimental results
(Rhode, 1971), but is not confirmed by others
(Bekesy, 1960; Johnstone and Boyle, 1967). However,
there is indirect evidence for mechanical nonlinearities.
We reported previously (Legouix, Remond and Greenbaum,
1973) that there was a striking analogy between the
effects of suppressing tones and of changes of hydrostatic
pressure in the perilymph. Moreover, we observed
that high intensity tones, around 120 dB SPL, produced
a unidirectional displacement of the cochlear partition
and a hyperpressure in the fluid of scala vestibuli
(Legouix and Pierson, 1973).

However, the nonlinear effects observed on CM
may be related to the electrical processes without
mechanical correlates. Different results have been reported
which support this hypothesis (Legouix and Chocholle,
1957; de Boer and Six, 1960; Dallos, 1970). According
to Durrant and Dallos (1972), the electrical nonlinearities
would appear at lower intensities and would, in
any case, predominate over mechanical nonlinearities. The
present results seem to confirm the electrical origin of
the nonlinearities at low levels, but they indicate that
the mechanical ones would dominate at higher intensities.

Some discrepancies in the results may occur
according to the phenomenon which is taken as an index
of the nonlinearity. It is not very clear whether SP and
interference are related to the same fundamental mechanism.
According to Durrant and Dallos (1972), SP is
more likely dependent upon electrical transduction
mechanisms because the electro-mechanical nonlinearities
appear to be significantly more asymmetrical than the
hydromechanical processes, and would easily produce dc
components. While we found a good relationship between the
variations of interference and of SP (Legouix, Remond
and Greenbaum, 1973), the present results show that during
hypoxia this relation is complicated, probably because
the electrical nonlinearities are modified in a
complex manner by the deprivation of oxygen.

The functional significance of the nonlinearities
is a matter of debate. While Evans, Rosenberg and
Wilson (1971) reported that the discharges of the auditory
nerve fibers do not reflect the existence of a nonlinear
filter, other data obtained in the cochlear nucleus
do show nonlinear effects (Rose et al., 1974). In
a previous paper, we reported (Legouix, Remond and
Greenbaum, 1973) that interference can explain the
occurrence of the two-tone inhibition observed on the
auditory nerve fibers (Fig. 3).





Fig. 3. Upper curve: typical inhibitory and excitatory
response areas for an auditory nerve fiber stimulated by
a tone at a characteristic frequency at a level noted by
the small triangle (from Sachs, 1969).

Below: a curve showing the depression of CM recorded in
the basal turn of the cochlea as a function of the
suppressing frequency. The analogy between the two curves
suggests that two-tone inhibition can be explained by the
suppression (interference) demonstrated on CM
(Legouix et al., 1973). Abscissa: suppressor frequency
[kHz], 70 dB.




It is very likely that the nonlinearities which
are reflected in interference participate also in the
tuning of the basilar membrane movements and of the
related CM. If, as we suggested, the nonlinearities are
greater for frequencies above and below the resonance, they
would sharpen the tuning. However, recent findings
suggest that the possible role of these nonlinearities is
complex and deserves further investigation.

References 

von Bekesy, G. (1960). Experiments in Hearing
(McGraw-Hill, New York, 1960).

de Boer, E., and Six, P.D. (1960). "The Cochlear Difference
Tone," Acta Oto-Laryngol. 51, 64.

Dallos, P. (1970). "Combination Tones in Cochlear Microphonic
Potentials," in: Frequency Analysis and Periodicity
Detection in Hearing (R. Plomp and G.F. Smoorenburg, Eds.).

Durrant, J.D., and Dallos, P. (1972). "Influence of
Direct-Current Polarization of the Cochlear Partition
on the Summating Potentials," J. Acoust. Soc. Am. 52,
542-552.

Eldredge, D.H., and Miller, J.D. (1971). "Physiology
of Hearing," Annual Review of Physiology, vol. 33.

Engebretson, A.M., and Eldredge, D.H. (1968). "Model for
the Nonlinear Characteristics of Cochlear Potentials,"
J. Acoust. Soc. Am. 44, 548-554.

Evans, E.F. (1972). "The Frequency Response and Other
Properties of Single Fibers in the Guinea-Pig Cochlear
Nerve," J. Physiol. (Lond.) 226, 263-287.

Evans, E.F., Rosenberg, J., and Wilson, J.P. (1971).
"The Frequency Resolving Power of the Cochlea,"
J. Physiol. (Lond.) 216, 58-59P.

Johnstone, B.M., and Boyle, A.J. (1967). "Basilar Membrane
Vibration Examined with Mössbauer Technique,"
Science 158, 389-390.

Kim, D.O., Molnar, C.E., and Pfeiffer, R.R. (1973). "A
System of Nonlinear Differential Equations Modeling
Basilar-Membrane Motion," J. Acoust. Soc. Am. 54, 1517-1529.

Legouix, J.P., and Chocholle, R. (1957). "Modification
de la distorsion du potentiel microphonique par la
polarisation de l'organe de Corti," C.R. Soc. Biol. 151
(2), 1851-1854.

Legouix, J.P., and Pierson, A. (1973). "Mechanism of
the Short-term Poststimulatory Depression of the Cochlear
Microphonics (Hysteresis)," J. Acoust. Soc. Am. 54,
16-21.

Legouix, J.P., Remond, M.C., and Greenbaum, H. (1973).
"Interference and Two-tone Inhibition," J. Acoust. Soc.
Am. 53, 409-419.

Rhode, W.S. (1971). "Observations of the Vibration of
the Basilar Membrane in Squirrel Monkeys Using the
Mössbauer Technique," J. Acoust. Soc. Am. 49, 1218-1231.

Rhode, W.S., and Robles, L. (1973). "Evidence for Nonlinear
Vibrations in the Cochlea from Mössbauer
Experiments," J. Acoust. Soc. Am. 54, 268 (A).

Rose, J.E., Kitzes, L.M., Gibson, M.M., and Hind, J.E.
(1974). "Observations on Phase-Sensitive Neurons of
Anteroventral Cochlear Nucleus of the Cat: Nonlinearity
of Cochlear Output," J. Neurophysiol. 37, 218-253.

Sachs, M.B. (1969). "Stimulus Response Relations for
Auditory-Nerve Fibers: Two-tone Stimuli," J. Acoust.
Soc. Am. 45, 1025-1036.




APPENDIX 



Several hypotheses may be proposed concerning
the possible role of nonlinearities in the cochlear
functions. In a recent paper, Kim, Molnar and Pfeiffer (1973)
gave a detailed review of the phenomena which can be
explained by the nonlinear mechanisms. The nonlinearities
which are observed in the mechanical vibrations of the
basilar membrane (Rhode and Robles, 1973; Legouix
et al., 1973) must modify the tuning of the basilar
membrane. The mechanoelectrical nonlinearities which are
observed on CM can possibly modify the excitation of the
auditory nerve fibers and consequently their tuning. It
appears most likely that the two sorts of nonlinearities
are governed by a similar function of the frequency and
are mainly related to the amplitude of the basilar
membrane movement.

One important point is to determine this function
with precision. It seems that the decrease of CM
produced by an interfering frequency is a good index of
the nonlinearities. Dallos indicated (this meeting)
that the interference function presents one single peak
at the resonance frequency or slightly above. We reported
previously (Legouix et al., 1973) that, frequently, two
peaks were observed and, sometimes, one single peak which
is apparently the same as described by Dallos. The
reasons for these variations are not clear to us. It might
be suggested that some difficulties arise because the
electrodes are recording from more than one single set of
hair cells, or that the suppressing tones in the lower
frequency range generate distortion products able to fall in
the suppressing zone. In any case, with suppressing tones of
low intensity, the maximum of interference is at frequencies
above the resonance frequency. This fact is in agreement
with the nonlinearity which was observed in the basilar
membrane movements by Rhode and Robles (1973) for
frequencies which are above the resonance frequency. This
nonlinearity can help in sharpening the cut-off of the
response curves on the high frequency side. If another
zone of nonlinearity exists below the resonance, as we
suggested, it could provide another sharpening on the low
frequency side. It would in any case be less important than
on the high frequency side (Fig. 4).




Fig. 4. Amplitude of CM as a function of the frequency of
an interfering tone of constant intensity. In (a) the
intensity is 67 dB SPL; in (b) the intensity is 80 dB SPL.
The test tone (indicated by the arrow) is fixed at
65 dB SPL. With high intensities, the interference is
observed for lower frequencies.

For higher intensities, we observed that the
interference function extends towards the lower frequencies
and may produce an attenuation which probably tends
to flatten the resonance curve. This would agree with the
widening of the response curves of the auditory nerve
fibers when the intensity reaches a certain level.

In the case of two-tone inhibition, the
test tone is chosen at the characteristic frequency of
the explored region, and when the interfering tone is
close to this frequency, additions (and beats) occur
which divide the suppressing zone into two clear classes,
below and above the resonance frequency. They seem to
correspond to the classical inhibitory zones observed on
the single fibers of the auditory nerve.

It is to be noted that the shape of the curve
of interference as a function of frequency, obtained with
suppressor tones of relatively high intensity, is strongly
reminiscent of the response curves of the auditory
nerve fibers and also of the psychophysiological masking
curves as shown by Zwicker (this meeting).

This is due, probably, to the fact that interference
is a function of the amplitude of the basilar
membrane movement. In these conditions, CM amplitude
reflects the tuning of the basilar membrane more accurately
than the classical measurement. In fact, the tuning
derived from these curves appears greater than for
the movement of the basilar membrane because the
nonlinearity increases near the resonance frequency.





COCHLEAR MICROPHONIC CORRELATES OF CUBIC DIFFERENCE TONES* 

PETER DALLOS and MARY ANN CHEATHAM 
Northwestern University, Evanston, Illinois, USA

INTRODUCTION 

Our purpose is to underscore our previous contentions (Dallos, 1969;
1970; 1973a; 1973b) that cubic difference components of the type 2f1-f2
do not possess a direct correlate in the normal cochlear microphonic
potential, and thus in the gross motion pattern of the cochlear partition.
The important implication of all available psychoacoustic evidence
(Plomp, 1965; Zwicker, 1955; Goldstein, 1967; Smoorenburg, 1972; and
others) and that of the single-unit recordings of Goldstein and Kiang
(1968), is that the cubic difference components (CDC's) are analyzed
in the cochlea at a place that corresponds to their frequency. Since
the only experimentally demonstrated frequency analyzer in the cochlea
is the basilar membrane, which performs its analysis via the traveling
wave mechanism (von Bekesy, 1960), one would expect that traveling waves
corresponding to the various CDC's could easily be shown. In fact, our
work on the normal cochlear microphonic (CM) cited above, and the
(negative) attempts of Wilson and Johnstone (1972) to measure 2f1-f2
in basilar membrane motion tend to indicate that significant CDC content
does not exist at this stage of the analysis. Some work of Smoorenburg
(1972) and of Sachs (1974), utilizing subjects having cochlear pathologies,
also tends to lead toward this conclusion.

While the CM data appear to indicate the absence of a "properly
behaving" 2f1-f2 component, it was pointed out (Goldstein, 1972 and

*This report is based on Dallos and Cheatham, 1974. 




private communication) that we have not addressed ourselves to a
possible interpretation of the negative results, namely that well-behaved
CDC's are present but that they are rendered unmeasurable by the
interference phenomenon. Interference in the cochlea is a well-known effect
resulting in the alteration (usually diminution) of the CM response to
one input in the presence of a second tonal signal (Black and Covell,
1936; Wever et al., 1940; Engebretson and Eldredge, 1968; Legouix et al.,
1973; Ferraro and Dallos, 1973; Dallos et al., 1974). It could be argued
that the 2f1-f2 content of the CM that would be analogous to the
psychoacoustically observed CDC is effectively eliminated, or diminished, by
the interfering effect of the stronger primaries. In fact, we have
reported some data that appear to contradict this notion (Dallos, 1969).
It was observed that tonal interference upon a CM combination component
takes place by the interference of one primary upon the other, and not
by the direct interference of either primary upon the CDC. While the
1969 results provided a glimpse of how interference modifies the CDC's,
it is necessary to produce more systematic information before it can be
safely said that CM correlates of the psychoacoustically observed CDC's
are truly lacking. The amount of change in the CM due to interference
is strongly dependent upon the relative frequencies of the two tones
present and somewhat dependent upon their relative intensities. A sound
is most effective as an interfering tone if its frequency is just above
the best frequency of the electrode location. This can be demonstrated
with the aid of Fig. 1 where the magnitude of the CM recorded from the
basal turn of the cochlea in response to a 7000 Hz tone at a stapes




Fig. 1. Relative magnitude of the CM in response
        to a 7000 Hz (arrow) tone presented at
        1 Å stapes displacement level as a
        function of the frequency of an interfering
        tone which is presented at a
        constant stapes displacement of 10 Å.
        Recording is from the first turn.
        (Adapted from Dallos et al., 1974).



displacement* of 1 Å is shown as a function of the frequency of an
interfering tone that is presented at the stapes displacement of 10 Å. The
best frequency of the first turn electrode location ranges between 9
and 13 kHz, the most usual value being 12 kHz. It is apparent from the
figure that significant interference is confined to a relatively narrow
band of interfering frequencies which is at or above the best frequency.
Distant from the best frequency the interference is negligible. All
interference functions seen by us can be exemplified by the plot in
Fig. 1; in other words, they all demonstrate that there is one and only
one frequency band where interfering tones are effective. This is in
contrast to the recent findings of Legouix et al. (1973) who contend that
there are two effective frequency regions, one below and one above the
best frequency. The reason for our differences is not clearcut. It is




*When the input quantity to the ear is designated as stapes displacement,
these measurements were obtained as follows. At any frequency the
sound level at the eardrum was adjusted to compensate for the effect of
the average middle ear transfer function of the guinea pig. Based on the
measurements of Johnstone and Taylor (1971) and of Wilson and Johnstone
(1972), if constant stapes displacement was desired then sound was kept
constant up to 400 Hz, beyond which its level was increased at a rate of
8 dB/octave. It can be estimated that at 1 Å stapes displacement the
sound pressure at the eardrum is 30 dB (re 0.0002 dyne/cm²) for frequencies
less than 400 Hz. More details on the process of compensating for
the middle ear can be found in Dallos (1973b).




expected that major interference would occur upon the lower frequency
members of a multi-tone complex. Thus there clearly exists a potential
that f1 and f2 could create a significant interference with the CM
components corresponding to the cubic difference tones which of course
are below the frequencies of the primaries. An experimental evaluation
of this possibility is presented below.
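
The middle-ear compensation described in the footnote on stapes displacement can be illustrated numerically. The sketch below is not the authors' procedure in detail: it encodes only the figures quoted in the footnote (30 dB at the eardrum for the 1-unit reference displacement, read here as 1 Å, flat up to 400 Hz, then rising at 8 dB/octave). It further assumes, as a labeled simplification, that stapes displacement scales linearly with sound pressure, so that a tenfold displacement adds 20 dB. The function and constant names are hypothetical.

```python
import math

REF_SPL_DB = 30.0           # eardrum level for 1 Å stapes displacement below 400 Hz
CORNER_HZ = 400.0           # level is held constant up to this frequency
SLOPE_DB_PER_OCTAVE = 8.0   # increase in level above the corner frequency


def eardrum_spl_db(freq_hz, displacement_angstrom=1.0):
    """Estimated eardrum level (dB re 0.0002 dyne/cm^2) for a target displacement.

    ASSUMPTION: displacement is proportional to sound pressure, so the level
    changes by 20*log10(displacement) relative to the 1 Å reference.
    """
    if freq_hz <= 0 or displacement_angstrom <= 0:
        raise ValueError("frequency and displacement must be positive")
    level = REF_SPL_DB + 20.0 * math.log10(displacement_angstrom)
    if freq_hz > CORNER_HZ:
        level += SLOPE_DB_PER_OCTAVE * math.log2(freq_hz / CORNER_HZ)
    return level


# One octave above the corner at the reference displacement, and the
# 10 Å level used for interfering tones at a low frequency.
print(round(eardrum_spl_db(800.0), 1))
print(round(eardrum_spl_db(200.0, displacement_angstrom=10.0), 1))
```

Under these assumptions a constant-displacement sweep requires progressively higher sound levels above 400 Hz, which is why equal stapes displacement, rather than equal SPL, is used to compare primaries at different frequencies.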

NEW RESULTS* AND DISCUSSION 

We wish to approach the question of the nature of the 2f1-f2 content
of the CM by studying tuning curves for these components and comparing
them to those obtained for single frequency inputs. Our means of
doing this is to maintain a constant frequency ratio between the two
primaries and to present both at the same approximate stapes displacement.
The measured 2f1-f2, or other distortion components, are then plotted in
the form of a tuning curve. Most data reported here were obtained for
the frequency ratio f2/f1 = 1.4. The means of plotting the tuning curves
and their interpretation deserve some discussion. Any distortion
component can be graphed in two ways as a function of frequency. One
can elect to plot a distortion component magnitude at the actual
frequency of the distortion product in question, that is, to graph the
magnitude of a fourth harmonic CM response to a 200 Hz fundamental at
800 Hz; or one might choose to plot this value at the frequency of the
fundamental, i.e., at 200 Hz. When the input is a two-tone complex and

*In all experiments that are described in this paper the data were
collected from anesthetized guinea pigs. The cochlear potentials were
obtained from differential electrodes placed in any of the three lower
turns of the cochlea. Sound was delivered in a closed system and was
monitored near the eardrum. All measurements were taken with a 3 Hz
bandwidth frequency analyzer. Details of the recording methods are
described in several past publications; they are summarized in Dallos
(1973a).




combination components are of interest, then plotting at the "fundamental"
frequency means plotting at the average frequency of the primaries:
(f1+f2)/2. Depending on the means of plotting and on whether or not
the distortion component is mediated by a traveling wave, four different 
configurations can be obtained. Let us consider first the conceptually 
simplest, and traditionally assumed, case that all components possess 
their own traveling wave. In this situation a distortion component 
having a frequency of 500 Hz would have a similar spatial pattern as an 
externally introduced tone of 500 Hz; thus the tuning curve of the 
distortion component when plotted at its own frequency should peak at the 
same place as the ordinary fundamental tuning curve. This should apply 
to any distortion component, thus harmonics of any order or combination 
components of any type, mediated by traveling waves, should generate 
nesting tuning curves that peak at the best frequency of the electrode 
location. 

If distortion components are not processed in the cochlea according
to a mechanical frequency analysis, in other words, if they are not
mediated by traveling waves, then when plotted at the fundamental
frequency it is expected that all components would show a peak, and produce
nesting tuning curves, at the maximum of f1. Finally, when all components
are plotted at their own frequency but it is assumed that traveling
waves do not precede the generation of CM distortion components, the
2f1-f2 component should peak where (f1+f2)/2 = f_BF (where f_BF is the best
frequency of the electrode location). This condition, when f2/f1 = 1.4 is
substituted, yields a maximum at f_BF/2. It is clear that the relative
position of the distortion plots, and their relation to the fundamental
curve, is indicative of whether or not the distortion products are



accompanied by traveling waves of their own. 
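
The arithmetic behind the predicted peak positions can be checked with a few lines of code. This is a sketch of the reasoning in the text only, for the case in which 2f1-f2 is plotted at its own frequency: with a traveling wave of its own the component should peak at the best frequency of the electrode location; without one, it should peak where the average of the primaries equals the best frequency. The function name, the symbol `f_bf` for the best frequency, and the 2500 Hz example value are illustrative choices, not quantities defined by the authors.

```python
def cdc_peak_at_own_frequency(f_bf, ratio=1.4, traveling_wave=False):
    """Predicted peak of the 2f1-f2 tuning curve, plotted at 2f1-f2.

    f_bf           -- best frequency of the electrode location (Hz)
    ratio          -- frequency ratio f2/f1 of the primaries (must be < 2)
    traveling_wave -- True if the component has its own traveling wave
    """
    if f_bf <= 0 or not (1.0 < ratio < 2.0):
        raise ValueError("need f_bf > 0 and 1 < ratio < 2")
    if traveling_wave:
        # The component behaves like an external tone of the same frequency.
        return f_bf
    # No traveling wave: the peak occurs where (f1 + f2)/2 = f_bf,
    # with f2 = ratio * f1, hence f1 = 2*f_bf / (1 + ratio).
    f1 = 2.0 * f_bf / (1.0 + ratio)
    return 2.0 * f1 - ratio * f1


# With f2/f1 = 1.4 the no-traveling-wave prediction is half the best frequency.
print(cdc_peak_at_own_frequency(2500.0))                        # close to 1250 Hz
print(cdc_peak_at_own_frequency(2500.0, traveling_wave=True))   # 2500.0
```

For f2/f1 = 1.4 the factor 2(2 - 1.4)/(1 + 1.4) equals 0.5, which reproduces the half-best-frequency maximum stated in the text; other ratios would give a different factor.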

All of our data show that when obtained at low and moderate driving
levels, any CM distortion component peaks in the vicinity of the best
frequency of the electrode location when these components are plotted
at the fundamental frequency. One could thus conclude that CDC's appearing
in the CM behave as all other types of distortion components, that
they are in no way distinguished from other nonlinear products, and
that all such products appear with maximal strength in the region of the
primaries. Before these conclusions can be found acceptable it is
necessary to investigate the effect of interference upon the 2f1-f2
content of the CM. One question that requires an answer, for example,
is whether a truly dominant response component corresponding to a traveling
wave might not be simply obliterated as a consequence of tonal
interference. To examine this possibility let us assume that 2f1-f2 is
mediated by traveling waves and let us consider the interference effect
by the lower primary, f1, upon a pure tone response that simulates
2f1-f2. To state it differently, let us question what the degree of
interference is on the CM response to a pure tone whose frequency
corresponds to 2f1-f2 by another pure tone whose frequency corresponds
to f1. Here f2/f1 = 1.4, thus f1/(2f1-f2) = 1/0.6. The frequency range of
interest is covered by two tones whose frequency ratio is 1/0.6,
and the CM response to the probe is measured both alone and in the
presence of the interfering tone. The ratio of CM responses thus
generated is considered the measure of interference. In most situations
the level of f_int was 10 Å, while that of f_probe was 1 Å.

In Fig. 2 a 1 Å fundamental plot is presented for reference, and in
addition the 2f1-f2 tuning curve obtained at 10 Å is also shown plotted




Fig. 2. Fundamental (f1) and 2f1-f2 tuning curves from the second
        cochlear turn. The basic sensitivity of the electrode location
        is indicated by the fundamental tuning curve obtained at an
        input of 1 Å, while the 2f1-f2 plot is given on the basis of
        a constant 10 Å stapes displacement input. The latter is plotted
        at the component's own frequency. A third plot [(2f1-f2)COMP]
        depicts how the tuning curve changes when it is linearly
        compensated for possible interference by f1 upon 2f1-f2
        itself. The compensation is obtained by measuring the
        interference of f1 upon a pure tone whose frequency is equal to
        2f1-f2 and compensating the actual 2f1-f2 data for the changes
        in the CM of the simulating tone. f2/f1 = 1.4 (from Dallos and
        Cheatham, 1974).

at its own frequency. The basic feature of the 10 Å plot can be
characterized by the peak at 1500 Hz, which is close to f_BF/2, a finding which
reflects a lack of a traveling wave. The 2f1-f2 tuning curve can be
linearly compensated in order to correct for the presumed interference
effect by f1. This is accomplished on the basis of the interference
measures obtained in the simulation experiment. This compensated
2f1-f2 tuning curve is also included in Fig. 2. The striking characteristic
of this curve is that its shape is not significantly different
from the curve reflecting the original 2f1-f2 data. The peak of the
new curve coincides with that of the old one, and furthermore the minor
peak at 3600 Hz (which might signify the presence of a traveling wave




component) became, if anything, less significant. One may conclude from
this exercise, and from numerous others that we have completed on
2f1-f2 plots obtained from all three cochlear turns, that the reason for
our inability to demonstrate in the cochlear response a significant
2f1-f2 component which is preceded by a traveling wave cannot be sought
in the interference of the primaries upon this response.
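
The linear compensation used for the 2f1-f2 tuning curve can be sketched as follows. This is one plausible reading, stated as an assumption: the interference measure at each frequency is the ratio of the probe CM with the interfering tone present to the probe CM alone, and the measured 2f1-f2 magnitude is divided by that ratio to undo the presumed attenuation. The function names and sample values are hypothetical, and the authors' actual procedure may differ in detail.

```python
def interference_ratio(cm_probe_alone, cm_probe_with_interferer):
    """Ratio of probe CM with the interferer present to probe CM alone."""
    if cm_probe_alone <= 0:
        raise ValueError("cm_probe_alone must be positive")
    return cm_probe_with_interferer / cm_probe_alone


def compensate(cdc_magnitudes, ratios):
    """Divide each measured 2f1-f2 magnitude by its interference ratio.

    ASSUMPTION: a ratio below 1 means the simulating tone was attenuated,
    so the same factor is presumed to have attenuated the true component.
    """
    if len(cdc_magnitudes) != len(ratios):
        raise ValueError("one ratio is needed per measured point")
    if any(r <= 0 for r in ratios):
        raise ValueError("interference ratios must be positive")
    return [m / r for m, r in zip(cdc_magnitudes, ratios)]


# Illustrative linear magnitudes (arbitrary units), not measured data.
ratios = [interference_ratio(1.0, 0.5), interference_ratio(2.0, 2.0)]
print(compensate([1.0, 2.0], ratios))  # [2.0, 2.0]
```

If the compensated curve keeps essentially the same shape and peak as the raw curve, as reported for Fig. 2, direct interference by a primary upon the component cannot account for the missing traveling-wave peak.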

We must not conclude from the above demonstration that interference 
has no effect on the CDC's recorded in the CM. In fact, tonal inter- 
ference can and does modify the shape of the tuning curves belonging to 
the various combination components. The modification, however, takes 
place via the mutual interference of the two primaries, and not through 
their direct influence upon the CDC's themselves. The latter contention 
was supported above, while we have already mentioned the former in the 
introduction. 

When correction is made for the mutual interference of the two
primaries then 2f1-f2 plots become generally much smoother than those
showing the raw experimental data. Moreover, such curves, when plotted
at the fundamental frequency, tend to have their maxima at the same
frequency where the fundamental tuning curve peaks. Thus, if the
2f1-f2 plot of Fig. 2 were corrected for the mutual interference of
the primaries then the peak, originally occurring at 1500 Hz, would move
to 1250 Hz, which is exactly f_BF/2 (this plot is prepared with the
distortion component graphed at its own frequency, hence the peak is
at f_BF/2 instead of f_BF).

CONCLUSIONS 

Our intent in this communication has been to tie up some "loose
ends" and thus to reaffirm our earlier contentions that the properties




320 



Dallos and Cheatham: MICROPHONIC CORRELATES OF 2f 

of CDC*s recordable in normal CM do not reflect the salient character- 
istics of either the psychoacoustically observed (e.g. Goldstein, 1967) 
or neurophysiologically recorded (Goldstein and Kiang, 1968) CDC’s. The 
present discussion is focused on the discrepancy between the apparent 
cochlear locations where CEKU’s appear with maximal strength in the CM, 
and where they are presumably analyzed in order to produce their charac- 
teristic perceptual features. All available information tends to 
indicate that CM distortion components, irrespective of their order, are 
localized in the region of the traveling wave maximum. We contend that 
when recorded with differential electrodes at low sound levels, the CM 
is sufficiently representative of the displacement characteristics of 
the basilar membrane, that inferences about the spatial location of 
displacement patterns can be drawn from CM data with some confidence. 

If our arguments are correct, then it is implied that the CDC’s that 
are generated in the region of the primaries are not reanalyzed by the 
basilar membrane via its familiar traveling wave pattern. Since both 
the single fiber response properties of the 8th nerve and all pertinent 
psychoacoustic observations indicate that CDC’s are processed at a place 
in the cochlea that corresponds to the actual frequency of the CDC in 
question, and since we have no reason to doubt the validity of these 
observations, we are clearly confronted with an apparent conflict.* 

While some tentative suggestions may be made, the resolution of this
conflict is very obscure at the present time. The obvious benefit that
can be gained from the demonstration that CDC's are not present in the
gross motion pattern of the basilar membrane is that the validity of a
class of models that associate the CDC-producing nonlinearity with the
basilar membrane or with simple hydrodynamic processes of the cochlea
now needs to be questioned. To state it in different terms, whatever
nonlinear process is responsible for the genesis of the psychoacoustically
observed CDC, it ought to have such characteristics that its distortion
products are not distributed throughout the cochlea by traveling
waves corresponding to their own frequency. In other words, the CDC's
are not analyzed according to their spectra by the basilar membrane.

* One is not justified in invoking species-dependent effects to explain
the discrepancies. We have been successful in replicating the pertinent
features of 2f1-f2 behavior, seen by Goldstein and Kiang (1968) in single
fibers of the auditory nerve of the cat, in the responses of single units
of the guinea pig's auditory nerve.

ACKNOWLEDGMENTS 

This work was supported by grants from the National Institute of 
Neurological Diseases and Stroke, NIH. Dr. J.A. Ferraro contributed to 
the collection of the data treated in this paper. 

REFERENCES 

Bekesy, G. von (1960). Experiments in Hearing, 745 pages. McGraw-Hill,
New York.

Black, L.J., and Covell, W.P. (1936). "A Quantitative Study of the
Cochlear Response," Proc. Soc. Exp. Biol. Med. 33, 509-511.

Dallos, P. (1969). "Combination tone 2f_l-f_h in microphonic potentials,"
J. Acoust. Soc. Amer. 46, 1437-1444.

Dallos, P. (1970). "Combination tones in cochlear microphonic potentials,"
in Frequency Analysis and Periodicity Detection in Hearing, R. Plomp
and G. Smoorenburg, Eds. (A.W. Sijthoff, Leiden) 218-226.

Dallos, P. (1973a). The Auditory Periphery: Biophysics and Physiology
(Academic Press, New York) 566 pages.

Dallos, P. (1973b). "Cochlear Potentials and Cochlear Mechanics," in
Basic Mechanisms of Hearing (A. Møller, ed., Academic, New York)
335-372.






Dallos, P. and Cheatham, M.A. (1974). "Interference in the cochlear
Part II. Combination components," J. Acoust. Soc. Amer., to
be published.

Dallos, P., and Sweetman, R.H. (1969). "Distribution Patterns of
Cochlear Harmonics," J. Acoust. Soc. Amer. 45, 37-46.

Dallos, P., Cheatham, M.A., and Ferraro, J.A. (1974). "Cochlear Mechanics,
Nonlinearities, and Cochlear Potentials," J. Acoust. Soc. Amer.,
in press.

Engebretson, A.M., and Eldredge, D.J. (1968). "Model for the Non-linear
Characteristics of Cochlear Potentials," J. Acoust. Soc. Amer.
44, 548-554.

Ferraro, J.A., and Dallos, P. (1973). "Cochlear microphonic interference
effects in the guinea pig," Presented at Meeting of Acoustical
Society of America, Los Angeles.

Goldstein, J.L. (1967). "Auditory nonlinearity," J. Acoust. Soc. Amer.
41, 676-689.

Goldstein, J.L. (1972). "Evidence from aural combination tones and
musical tones against classical temporal periodicity theory,"
Symposium on Hearing Theory, IPO Eindhoven.

Goldstein, J.L., and Kiang, N.Y.-S. (1968). "Neural correlates of the
aural combination tone 2f1-f2," Proc. IEEE 56, 981-992.

Johnstone, B.M., and Taylor, K.J. (1971). "Physiology of the middle ear
transmission system," Otolaryngol. Soc. Aust. 3, 226-228.

Legouix, J.P., Remond, M.C., and Greenbaum, H.B. (1973). "Interference
and Two-Tone Inhibition," J. Acoust. Soc. Amer. 53, 409-419.

Plomp, R. (1965). "Detectability thresholds for combination tones,"
J. Acoust. Soc. Amer. 37, 1110-1123.

Sachs, R. (1974). Private communication.

Smoorenburg, G. (1972). "Combination tones and their origin," J.
Acoust. Soc. Amer. 52, 615-632.

Wever, E.G., Bray, C.W., and Lawrence, M. (1940). "The Interference
of Tones in the Cochlea," J. Acoust. Soc. Amer. 12, 268-280.

Wilson, J.P., and Johnstone, J.R. (1972). "Capacitive probe measures of
basilar membrane vibration," Symposium on Hearing Theory, IPO
Eindhoven.

Zwicker, E. (1955). "Der ungewöhnliche Amplitudengang der nichtlinearen
Verzerrungen des Ohres," Acustica 5, 67-74.







THE REPRESENTATION OF TONES AND COMBINATION TONES IN SPIKE DISCHARGE 
PATTERNS OF SINGLE COCHLEAR NERVE FIBERS 

R. R. PFEIFFER, C. E. MOLNAR, AND J. R. COX, JR. 

Washington University, St. Louis, Missouri, U.S.A. 

Considerable emphasis has been recently placed on nonlinear 
characteristics of the peripheral auditory system in psychophysical, 
electrophysiological, and mechanical studies. Frequently, comparisons 
are attempted between results that derive from experimental conditions 
that are too restrictive or not really comparable. We report here some 
results from response patterns of single cochlear nerve fibers in response 
to two-tone stimuli that may help to bridge some discrepancies between 
reported properties of cochlear microphonics, psychophysical responses, 
and spike discharges of single cochlear nerve fibers. In particular, 
we pay attention to the magnitude and phase of distortion products in 
these response patterns. 

Methods 

The experimental techniques are those common to single fiber
recording: healthy animals are anesthetized with Dial in urethane; the
cochlear nerve is exposed by surgical techniques; KCl-filled micropipettes
are placed visually on the nerve and manipulated manually from
outside a sound-quieted room; and stimuli are delivered by a low-distortion,
high-quality transducer, a modified Beyer DT-480 in this
case.

For the data reported here, all stimuli were digitally generated
by a liLINC Computer. All stimuli were either continuous single sinusoids
or phase-locked (0°) dual sinusoids (two-tone stimuli). In all cases
stimulus frequencies were multiples of 50 Hz so that all observed
distortion products were also multiples of 50 Hz. (This stimulus protocol
was adapted from a procedure devised by Messrs. Eldredge and Ronkin of the
Central Institute for the Deaf.)
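
The arithmetic behind this protocol can be sketched in a few lines. The following is only an illustration, not the authors' software: the function names are hypothetical, and the primaries chosen (300 and 350 Hz, the sixth and seventh harmonics of 50 Hz) are taken from the example of Figure 1. Because both primaries are integer multiples of 50 Hz, every sum and difference of them is as well.

```python
BASE_HZ = 50  # common fundamental of the stimulus set, as stated in the text


def combination_tones(f1, f2):
    """Return the low-order combination frequencies (Hz) of two primaries.

    The selection of components is an assumption made for illustration.
    """
    return {
        "f2-f1": f2 - f1,
        "f1+f2": f1 + f2,
        "2f1-f2": 2 * f1 - f2,
        "2f1+f2": 2 * f1 + f2,
    }


def harmonic_numbers(f1, f2, base=BASE_HZ):
    """Map each combination tone to its harmonic number of the base frequency."""
    if f1 % base or f2 % base:
        raise ValueError("primaries must be integer multiples of the base")
    return {name: freq // base
            for name, freq in combination_tones(f1, f2).items()}


# Primaries at the sixth and seventh harmonics of 50 Hz.
print(harmonic_numbers(300, 350))
```

Each combination tone therefore falls on one bin of a transform whose fundamental is 50 Hz, which is what allows a single period histogram to carry all components at once.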

Period histograms with a fundamental of 50 Hz were constructed for
all stimulus conditions, each of which lasted for approximately 40 seconds.
Discrete Fourier Transforms (DFT) of these histograms yield both amplitude
and phase of primary, harmonic, and combination tone components in the
response patterns (Figure 1). It can be shown that these computations are







Pfeiffer, et. al.; DISCHARGE PATTERNS OF SINGLE COCHLEAR NERVE FIBERS 



[Figure 1: period histogram (upper plot) and amplitude spectrum (lower plot).]

Figure 1. Period histogram of responses
to a two-tone stimulus. Abscissa is equal
to the period of the fundamental of the
two-tone combination while each of the two
tones is at some integer multiple of the
fundamental [the sixth and seventh
harmonics in this case]. Lower plot is
magnitude of the DFT of the period
histogram showing components at primary
frequencies as well as at various
combination tones and harmonics.



equivalent to synchronization measures from histograms that are obtained 
by selectively synchronizing to the period of the combination tone of 
interest (Goldstein and Kiang, 1968). The transform method not only 
yields all amplitudes and phases simultaneously in quantitative form, but 
is simpler and less cumbersome to compute. 
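
The equivalence claimed above can be illustrated with a small numerical sketch. Everything in it is an assumption made for the illustration: the spike train is a toy simulation (firing probability following a half-wave rectified two-tone waveform), spike times are taken to be recorded as integer clock ticks with 200 ticks per 20 ms period, and "synchronization" is taken to mean the normalized magnitude of the Fourier component (often called vector strength), which may differ from the index actually used by the authors.

```python
import cmath
import math
import random

TICKS_PER_PERIOD = 200   # hypothetical clock: 200 ticks per 20 ms period
N_PERIODS = 500          # shortened run; the text describes about 40 s


def simulate_spike_ticks(seed=0):
    """Toy spike train driven by harmonics 6 and 7 (purely illustrative)."""
    rng = random.Random(seed)
    ticks = []
    for p in range(N_PERIODS):
        for m in range(TICKS_PER_PERIOD):
            ph = 2 * math.pi * m / TICKS_PER_PERIOD
            drive = math.sin(6 * ph) + math.sin(7 * ph)
            if rng.random() < 0.05 * max(drive, 0.0):
                ticks.append(p * TICKS_PER_PERIOD + m)
    return ticks


def period_histogram(ticks):
    """Fold spike times onto one period of the 50 Hz fundamental."""
    hist = [0] * TICKS_PER_PERIOD
    for t in ticks:
        hist[t % TICKS_PER_PERIOD] += 1
    return hist


def dft_component(hist, k):
    """Complex DFT coefficient of the histogram at harmonic k."""
    n = len(hist)
    return sum(h * cmath.exp(-2j * math.pi * k * m / n)
               for m, h in enumerate(hist))


def sync_from_histogram(ticks, k):
    """Synchronization at harmonic k from the DFT of the period histogram."""
    return abs(dft_component(period_histogram(ticks), k)) / len(ticks)


def sync_direct(ticks, k):
    """Synchronization at harmonic k computed spike by spike, i.e. locked
    to the period of that component alone."""
    total = sum(cmath.exp(-2j * math.pi * k * t / TICKS_PER_PERIOD)
                for t in ticks)
    return abs(total) / len(ticks)


ticks = simulate_spike_ticks()
for k in (1, 6, 7, 13):
    print(k, round(sync_from_histogram(ticks, k), 4),
          round(sync_direct(ticks, k), 4))
```

Under these assumptions the two columns agree to rounding error, because folding a spike time modulo the 20 ms period does not change its phase at any harmonic of 50 Hz.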

Results 

We have presented data previously on the properties of fundamental 
and harmonic components of response patterns as a function of signal 
frequency and level (Pfeiffer and Molnar, 1970) and at that time 
suggested that there was considerable similarity between those results 
and some properties of cochlear microphonics (CM). We have found no 
reason to amend those results; and consequently, we will exclude data on 
harmonics of a single tone here and concentrate on results pertaining to 
so-called combination tones. Figures 2 and 3 show plots of both the
amplitude and phase of the combination tone components, including f1+f2,
and of f alone, A, for two
different fibers, both from the same animal. Each specific symbol 
corresponds to conditions having the same value of f^. Each point is 
plotted at its appropriate frequency and harmonic number to emphasize 
that all stimuli and response components are some multiple of 50 Hz.

In all cases, the stimulus levels of f1, f2, and single tones are at
approximately 60 dB SPL.

We note that at these stimulus levels the amplitudes of the 
combination tone components of the response are large when the primaries 
are closely spaced in frequency and that the magnitudes of the first 








Figure 2. Amplitude and phase of combination tone components
in response patterns to two-tone
stimuli, as well as of the primary tone component in response pattern to
single-tone stimuli, A. The vertical scale is relative amplitude (upper
plot) or phase relative to the stimulus (lower plot) modulo 2π. The
horizontal scale is harmonic number of the fundamental frequency, 50
Hz, in this and following figures. Each symbol is for the specific 
f^ given in the key. The data points are plotted at the value of the 
combination tone. The values of f^, therefore, can be calculated. 

order combination tones f1+f2 and f2-f1 are larger than those of second
order, 2f1-f2 and 2f1+f2 (not shown). Also, just as in the case of the
CM, "... one can state that it is the sensitivity of a particular
cochlear location to the primary frequencies that determines the
magnitude of a distortion component itself" (Dallos, 1973). It is also 
apparent that a single-tone stimulus, at frequencies corresponding to 
some of the combination tones, does not elicit amplitudes as large as 
those of the combination tone components elicited under two-tone 
stimulation, and similarly, no combination tone component has an 








Figure 3. Same as Figure 2, but
for a cochlear nerve fiber with
lower characteristic frequency,
CF. The two filled-in symbols
represent cases where the
combination tone frequency
coincided with that of f2-f1.

In such cases there appears to 
be an addition of amplitudes. 
This fiber was from the same 
animal as the one shown in 
Figure 2. 




Figure 4. Phase of primary tone 
component in response pattern versus 
harmonic number for single-tone 
stimulation (open circles) compared 
to phase of primary tone components 
in response patterns to two-tone 
stimulation (solid circles) for the 
same fiber. The numbers adjacent to
some circles indicate the number of
superimposed data points that were
obtained under a different f1, f2
stimulus combination.







amplitude larger than both of its primaries. 

The phase plots shown are for responses to single tone stimulation 
(A) as well as for the response components corresponding to each of the 
major combination tones. Of particular interest is the explanation of 
how the phases of the combination tones relate to the phases of the 
individual primary tones, as well as to the phase of responses to single- 
tone stimulation. 

Littlefield (1973), in his investigation of the linear range of
response of single cochlear nerve fibers, examined the relationship
between the phases of combination tones and the phases of the primary
tones in the response patterns for specific two-tone stimulus conditions
of CF:f1:f2 = 10:11:12, or 10:8:9. He found that the phases of first and
second order combination tones were related to the phases of the primary
tones in the following manner:



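$\varphi_{f_1+f_2} = \varphi_1 + \varphi_2 + \pi/2$ ,

$\varphi_{f_2-f_1} = \varphi_2 - \varphi_1 - \pi/2$ ,

$\varphi_{2f_1-f_2} = 2\varphi_1 - \varphi_2$ ,

$\varphi_{2f_1+f_2} = 2\varphi_1 + \varphi_2 + \pi$ ;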



where the reference for all phase measures is to the positive-going zero 
crossing of the stimulus. 

We have found that these relationships hold for all of the 
combination tones that have amplitudes above noise level, such as those 
shown in Figures 2 and 3. Further, we find that there is little 
deviation between the phase of the response components at the primary 
frequencies under two-tone stimulation and the phase of the response to 





Figure 5. Phase of combination
tone components of response
patterns to two-tone stimulation
(open symbols) compared to phase
values (solid symbols and dashed
lines) derived from the phase
curve from single-tone
stimulation (shown in Figure 4),
using the formulas given in the
text.





the same frequencies under single-tone stimulation (Figure 4).
Consequently, it is possible to calculate the phase of the combination
tones directly from the phase versus frequency curve obtained from single-
tone stimulation (Figure 5). These data support the notion that the 
nonlinearity that governs the generation of these combination tones can 
be described by a polynomial expansion. 
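
How a polynomial nonlinearity ties combination-tone phases to the phases of the primaries can be checked numerically. The sketch below is an illustration only: the polynomial coefficients, the unit primary amplitudes, and the 256 samples per period are hypothetical choices, and the constant offsets (0, π/2, π) that come out depend on the signs chosen for the coefficients. Phases are referred to the positive-going zero crossing, i.e. each component is written as A sin(kθ + φ), following the convention stated in the text.

```python
import cmath
import math

N = 256              # samples per 20 ms period (assumed)
H1, H2 = 6, 7        # primaries as harmonics of 50 Hz, as in Figure 1
C2, C3 = -0.2, 0.1   # hypothetical polynomial coefficients


def response(phi1, phi2):
    """One period of y = x + C2*x**2 + C3*x**3 for a two-tone input x."""
    out = []
    for n in range(N):
        th = 2 * math.pi * n / N
        x = math.sin(H1 * th + phi1) + math.sin(H2 * th + phi2)
        out.append(x + C2 * x ** 2 + C3 * x ** 3)
    return out


def sine_phase(y, k):
    """Phase of harmonic k when written as A*sin(k*theta + phase)."""
    z = sum(v * cmath.exp(-2j * math.pi * k * n / N)
            for n, v in enumerate(y))
    return cmath.phase(z) + math.pi / 2


def wrap(a):
    """Wrap an angle into the interval [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi


phi1, phi2 = 0.4, 1.1
y = response(phi1, phi2)
checks = {
    "f1+f2": (H1 + H2, phi1 + phi2 + math.pi / 2),
    "f2-f1": (H2 - H1, phi2 - phi1 - math.pi / 2),
    "2f1-f2": (2 * H1 - H2, 2 * phi1 - phi2),
    "2f1+f2": (2 * H1 + H2, 2 * phi1 + phi2 + math.pi),
}
for name, (k, predicted) in checks.items():
    print(name, abs(wrap(sine_phase(y, k) - predicted)) < 1e-9)
```

With these coefficient signs, each combination-tone phase is a fixed linear combination of the primary phases plus a constant, which is why a phase-versus-frequency curve measured with single tones suffices to predict the combination-tone phases whenever the primary phases are unchanged by the second tone.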

Discussion 

The amplitude and phase characteristics of the data shown here are 
consistent with a polynomial expansion of the nonlinearity governing 
the generation of the distortion products. We have not yet arrived at 
any conclusive description of that nonlinearity, but note that at least 
these properties are qualitatively similar to those reported and cited 
by Dallos (1973) for turn III cochlear microphonics. Also, these are
consistent with those data of Kim et al. (1973) that negate the need
to have the 2f1-f2 component generated by an essential nonlinearity at
the single fiber level (Goldstein and Kiang, 1968).

It could also be claimed from these data that the f2-f1 and
the 2f1-f2 combination tones are not a result of cochlear combination-tone
traveling waves. This, while being consistent with reported CM
phenomena, is in direct conflict with single fiber data reported by
Goldstein and Kiang (1968), as well as some seen in our own laboratory
(Kim, 1973), that illustrated response components at the 2f1-f2
combination tone, placed at the CF of a fiber, in the absence of
response components at either f1 or f2. The discrepancy between our
present results and the latter two reports could be due to the fact that
the CFs of the fibers presented here are two or more kilohertz lower.

The discrepancy between reported CM data (Dallos, 1973) and the Goldstein
and Kiang, and Kim reports, on the other hand, could be due to the fact
that the experimental conditions were not identical (Kim, 1972).

The consistency between the phases of the combination tones and the 
phases of response components to single tones shown here may serve as a 
means to identify components or sets of components that result from more 
than a single nonlinear source. 

We have not explored in detail any relationships between the
|f1+f2| and |f2-f1| or between |2f1-f2| and |2f1+f2| components, but
have observed that scatter plots of |f1+f2| vs. |f2-f1| approximate





symmetry about a line with slope of 1 as well as concentration along 
that line. More importantly, the scatter plots show that |2f1+f2| is
by no means negligible. In fact, for the fiber 81-2 represented in this
report, |2f1+f2| was generally greater than |2f1-f2|. This observation
is contrary to psychoacoustic results but is in line with observations 
of CM. We do not know, at this time, the relationship between the 
magnitude pairs as a function of stimulus level. But thus far, these 
results appear to bring the single fiber data closer to the CM data than 
heretofore expected. 

Considerable effort must yet be expended to explore the properties 
of the response patterns as a function of signal level and CF. 



APPENDIX 

There appear to be two different phenomena associated with or
responsible for the 2f1-f2 combination tone as observed from discharge
patterns of single cochlear nerve fibers. On the one hand, the data
reported by us in this volume (Pfeiffer, Molnar, and Cox, 1974) show
correlations between the magnitudes and phases of the 2f1-f2 component
and of the primary components of response. In addition, in this case,
the magnitudes of other combination tones, such as 2f1+f2, are
comparable to the magnitude of 2f1-f2. These relationships are consistent
with the generation of the distortion products at the site of the receptor
or neuron. On the other hand, data reported by others (Goldstein and
Kiang, 1968; Kim, 1973; Dallos, 1974) show a 2f1-f2 response component in
the absence or near absence of virtually all other response components
including those at the primary frequencies. These observations are
consistent with the concept of a traveling wave component at 2f1-f2.

In the first case, the data are qualitatively similar to those
reported for the CM (e.g., Dallos, 1974). There seems to be little
relationship between the properties observed and those reported for the
psychophysical measurements of 2f1-f2. Thus, the 2f1-f2 component
described in the first case may be psychophysically undetectable. This
idea is reinforced when one considers that the components at 2f1+f2,
2f2-f1, f2-f1, and f1+f2 are simultaneously of comparable magnitude
and they too are not psychophysically prominent.







In the second case, the 2f1-f2 component of the single cochlear
nerve fiber response is clearly dominant and appears to be the result
of a 2f1-f2 "stimulus". Since it is not accompanied by other components,
combination or primary, it is not unreasonable to assume that it is
psychophysically detectable. There have been no similar observations for
CM.



Since both phenomena have been observed from single cochlear nerve 
fibers, there must be a transition from one case to the other. This has 
yet to be explored. Perhaps the detailed relationships between the 
phases of the combination tone components of response and the phase of 
the fundamental component of response patterns to single tones, that we 
have reported in this volume, can lead to decisive experiments that 
identify the site and origin of the 2f1-f2 correlate of the psychophysical 
phenomena. 

Acknowledgment 

These studies were supported in part by grants and contracts from 
the United States Public Health Service. We also thank Ronald Cox,
Dekle Day, and Walter Milliken for their technical assistance.

References 

Dallos, P. (1974), "Cochlear Potential Correlates of Cubic Difference
Tones", Symposium on Psychophysical Models and Physiological Facts
in Hearing, Tutzing.

Dallos, P. (1973), THE AUDITORY PERIPHERY, Academic Press, New York.

Goldstein, J. L. and Kiang, N. Y. S. (1968), "Neural Correlates of
the Aural Combination Tone 2f1-f2", PROC. IEEE 56, 981-992.

Kim, D. O., Littlefield, W. M., Pfeiffer, R. R., and Molnar, C. E.
(1973), "Combination Tone 2f1-f2 in Responses of Single Cochlear
Nerve Fibers: Evidence Against Essential Nonlinearity", 86th
Meeting Acoustical Society of America.

Kim, D. O. (1973), Unpublished Data.

Kim, D. O. (1972), "A Nonlinear Model for Basilar Membrane Motion
and Related Phenomena of Single Cochlear Nerve Fibers", Doctoral
Dissertation, Department of Electrical Engineering, Washington
University, St. Louis, Missouri, U.S.A.

Littlefield, W. M. (1973), "Investigation of the Linear Range of the
Peripheral Auditory System", Doctoral Dissertation, Department of
Electrical Engineering, Washington University, St. Louis, Missouri,
U.S.A.

Pfeiffer, R. R. and Molnar, C. E. (1970), "Cochlear Nerve Fiber
Discharge Patterns: Relationship to the Cochlear Microphonic",
SCIENCE 167, 1614-1616.



ADDITIONAL REMARKS

SMOORENBURG: In order to avoid confusion I wish to join Dr. Pfeiffer in
emphasizing that the combination tones as measured by him are probably not
the electrophysiological correlates of the combination tones reported in
psychoacoustical studies (see the Addenda of Dr. Pfeiffer's paper).
Combination tones measured psychoacoustically behave like stimulus tones
in virtually every respect. This suggests that neural units will respond
best to these combination tones if the frequency of the combination tone
coincides with the best frequency of the unit. Such results are mentioned
in the discussion of Dr. Pfeiffer's paper and were also reported by
Goldstein and Kiang in 1968 (see reference in Pfeiffer's paper). However,
the data presented in Dr. Pfeiffer's paper show combination products that
in my view are introduced to a large extent by the nonlinear transduction
in the auditory receptor (see the paper by Schroeder and J.L.
Hall). This explains why the combination products are found if the
frequencies of the primaries are inside the response area of the fiber
rather than the combination tone frequency. Whether or not the frequency
content of the period histogram has any meaning for the tones we hear is a
matter of speculation but I believe that the neural responses that show
maxima if the combination tone frequency coincides with the best frequency
of the unit are the true correlates of the psychoacoustically measured
combination tones. When I was working in the group of Drs. J.E. Rose and
J.E. Hind we studied this type of correlate extensively. The responses
to combination frequencies such as f2 - f1, 2f1 - f2 and f1 + f2 were in
good agreement with the psychoacoustical data. The results will be
submitted to the Journal of Neurophysiology.







ON THE MECHANISMS OF COMBINATION TONE GENERATION AND LATERAL INHIBITION IN 
HEARING 

GUIDO F. SMOORENBURG

Institute for Perception TNO, Soesterberg, The Netherlands 
Outline 

This paper presents an elaboration on the discussion that concludes my 
previous paper "Combination Tones and Their Origin" (1972). 

The relation between the levels of the stimulus components f1 and
f2 and the level of the combination tone 2f1-f2 (f1<f2) suggests
that this combination tone might be generated by an auditory nonlinearity
of an amplitude-limiting type. Another property of such a nonlinearity
is that weak frequency components are suppressed by stronger ones. On
the basis of various experiments we shall discuss the question of whether
or not lateral suppression and the generation of combination tones can be
attributed to the same amplitude-limiting type of nonlinearity.

BASIC PROPERTIES OF COMBINATION TONES 

The most prominent combination tones are the difference tones corresponding
to f2-f1, and to f1-n(f2-f1) where n is a small integer (Plomp,
1965). By and large, the properties of the difference tone f2-f1 support
the classical view that the auditory system is essentially linear. In this
view distortion products may arise at higher stimulus levels and if they
arise their course is described by a polynomial expansion of the transfer
function: $f(x) = c_1 x + c_2 x^2 + c_3 x^3 + \dots$, where the nonlinear
terms are relatively small (Zwicker, 1955; Goldstein, 1967).

Properties of combination tones that are not in agreement with this
simple polynomial-expansion description were found to some extent by Hall
(1972) for f2-f1 but are typical for the combination tones of the type
f1-n(f2-f1). Particularly, 2f1-f2, the cubic difference tone (CDT) which is
the principal member of this set (n=1), has been investigated. Zwicker
(1955) showed that, with increasing stimulus level, the level required to
cancel the CDT had to be increased by considerably less than the 3 dB per
1 dB predicted on the basis of the third-power term in the polynomial
expansion. (This term would be the main contributor to the CDT.) Goldstein
(1967) emphasized that over a great range of sensation levels (20-70 dB SL)
the cancellation level is approximately proportional to the level of the
stimulus components (a 1 dB per 1 dB increase). Results from my own
experiment are given in Fig. 1. These results are averages for one subject.
The accuracy, if not indicated differently, is about 1 dB. From the figure
the proportionality is evident. At higher cancellation levels the curves
tend to become flatter. Also, the cancellation level increases with a
decreasing frequency difference of the primaries f1, f2. This is a general
finding and a basic argument for the assumed cochlear origin of the CDT.
The cochlear excitation patterns corresponding to the primaries would
interact more heavily for smaller frequency differences. In addition to
these results dips were found in the curves for f2/f1>1.20. These dips are
thought to develop by out-of-phase contributions to the CDT from different
nonlinear sources. The different sources may correspond to a distributed
nonlinearity (e.g. in the cochlea; for close frequencies the dips will
disappear because of reduced phase shifts) or to several distinct
nonlinearities (e.g. middle ear and inner ear nonlinearity; for close
frequencies the dips will disappear because of an increasing cochlear
contribution only). Of course, mixtures of these examples may occur as well.

Fig. 1. Cancellation level for 2f1-f2 as a function of
the stimulus level (L1=L2) with the frequency ratio of
the stimulus components f2/f1 as the parameter. Results
for one subject.
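
The 3 dB per 1 dB growth that the third-power term predicts for equal-level primaries can be verified with a short calculation. This is a minimal sketch under stated assumptions: the coefficient c3 and the starting amplitude are arbitrary, and the factor 3/4 is the standard trigonometric weight of the 2f1-f2 term in the expansion of the cube of a two-tone sum.

```python
import math


def level_db(amplitude):
    """Level in dB relative to unit amplitude."""
    return 20 * math.log10(amplitude)


def cdt_amplitude_cubic(a1, a2, c3=1.0):
    """Amplitude of the 2f1-f2 product from the c3*x**3 term alone."""
    return 0.75 * abs(c3) * a1 ** 2 * a2


def growth_db_per_db(step_db=1.0, a=0.01):
    """CDT level change per dB when both primaries are raised together."""
    g = 10 ** (step_db / 20)
    before = level_db(cdt_amplitude_cubic(a, a))
    after = level_db(cdt_amplitude_cubic(g * a, g * a))
    return (after - before) / step_db


print(round(growth_db_per_db(), 6))
```

The result does not depend on c3 or on the starting amplitude, since the product a1²a2 is homogeneous of degree three; it is this fixed slope that the measured cancellation levels fail to show.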

Another typical aspect of CDT behavior is found when its level is measured
as a function of the level of only one of the primaries. Whereas the
third-power term predicts a proportional increase in the cancellation level
with L2 at any level of L1, this proportionality is found only for L2<L1.
Fig. 2 shows that the cancellation level decreases for L2>L1. This too is
a general finding. The decrease is subject dependent. The slope may range
from -1 (dB per dB) almost up to zero. Fig. 2 shows the extreme cases (two
subjects) in my data.

Fig. 2. Cancellation level for 2f1-f2 as a function of
the level of the higher stimulus component. Results
for two subjects.

DYNAMIC PROPERTIES OF THE COMBINATION TONE 

According to the third-power term in the polynomial expansion the amplitude
of the distortion product 2f1-f2 is proportional to $a_1^2 a_2$ if $a_1$ and
$a_2$ are the amplitudes of the primaries f1 and f2, respectively. Goldstein
(1967) proposed that the different behavior of the CDT can be described by
normalizing by a factor $(a_1+a_2)^2$. (One may also use $a_1^2+a_2^2$, the
energy in the stimulus.) The introduced normalization may imply an adaptive
nonlinearity, one that changes with time. The "mechanical impact" models by
Crane (1966, 1972) are examples of adaptive nonlinearities. We investigated
the time course of the nonlinearity by measuring the forward-masking effect
of the CDT as a function of the duration of the stimulus f1, f2. The level
of the CDT was expressed as that level of a reference masker (of frequency
2f1-f2) which gave the same amount of masking. The duration of the reference
masker was always equal to the duration of the stimulus f1, f2. A component
f1 at the level used in the stimulus f1, f2 was added to the reference masker
in order to make the reference masker maximally similar to the stimulus.
Among other reasons, this is required since the residual masking by f1 at the
place of the CDT cannot be neglected. The results for three observers are
given in Fig. 3. For one subject the CDT level was also estimated from a
backward-masking measurement (indicated by B on the abscissa).
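
The effect of the normalization mentioned above can be seen in a short numerical sketch. It assumes, for illustration only, that the normalized CDT amplitude is simply the cubic product divided by the stated factor, with all constants set to one; the function names are hypothetical.

```python
import math


def cdt_normalized(a1, a2):
    """Cubic product a1^2 * a2 divided by (a1 + a2)^2."""
    return a1 ** 2 * a2 / (a1 + a2) ** 2


def cdt_energy_normalized(a1, a2):
    """Cubic product a1^2 * a2 divided by the stimulus energy a1^2 + a2^2."""
    return a1 ** 2 * a2 / (a1 ** 2 + a2 ** 2)


def slope_db_per_db(model, a=0.01, step_db=1.0):
    """Change of the model output in dB per dB, primaries raised together."""
    g = 10 ** (step_db / 20)
    return 20 * math.log10(model(g * a, g * a) / model(a, a)) / step_db


print(round(slope_db_per_db(cdt_normalized), 6))
print(round(slope_db_per_db(cdt_energy_normalized), 6))
```

Either normalization lowers the degree of the expression from three to one, so the predicted level grows by 1 dB per dB of stimulus level, in line with the proportionality reported for the cancellation data.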




[Figure 3: abscissa, stimulus duration in msec.]

Fig. 3. Estimates of the level of 2f1-f2 as a function
of stimulus duration for three subjects. The estimates
were obtained by a forward-masking paradigm. B denotes
a single result from a backward-masking measurement.

The results show no effect of signal duration. Apparently, the temporal
build-up of a stimulus tone at 2f1-f2 and of the CDT are similar to the
extent investigated. No evidence for an adaptive nonlinearity is found.





A SIMPLE DESCRIPTIVE NONLINEARITY 

Since we have no evidence for an adaptive nonlinearity a further examination
of static (non-memory) transfer functions seems fruitful. The results
in Fig. 1 suggest that the nonlinear term must be important. Even at low
stimulus levels the CDT level runs up to values only 10 dB below the
stimulus level. So why should we not consider transfer functions that are
predominantly nonlinear, for example $f(x) = x^p$? In that case cancellation
levels should not be interpreted as output levels. The cancellation tone too
is subject to the nonlinearity. The cancellation tone should be interpreted
as that component, added to the input, that zeros the output component of
its frequency. If cancellation is interpreted in this way it can be proved
mathematically that for power functions like $f(x) = x^p$ the cancellation
amplitude is proportional to the stimulus magnitude. This holds for all odd
integer powers (which generate 2f1-f2) as well as for powers p in between
zero and one if the transfer function is defined as follows:
$f(x) = \mathrm{sign}(x)\,|x|^p$.
The latter functions (0<p<1) give interesting predictions for level changes
of the individual stimulus components. For L2<L1 the prediction is a
proportional increase in the cancellation level with L2 and for L2>L1 an
inversely proportional decrease independent of p if 0<p<1. Fig. 4 presents
the predictions for L1=L2 and L1=const. where p=0.6 (solid curves; these
curves are completely determined after the choice of p). Unquestionably,
these predictions are much closer to the experimental data than the
predictions based on the classical polynomial expansion.
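
The key step behind the proportionality statement is the homogeneity of a power device: scaling the whole input by a factor k scales the output, and hence every output component, by k to the power p. The sketch below checks this numerically; it does not search for the cancelling tone itself. The harmonic numbers (primaries at 6 and 7, probe at 5), the probe amplitude and phase, and the number of samples are hypothetical choices made for the illustration, with p = 0.6 as in the text.

```python
import cmath
import math

N = 512   # samples per period of the common fundamental (assumed)
P = 0.6   # exponent of the power device, as in the text


def power_device(x, p=P):
    """Odd-symmetric power law sign(x) * |x|**p."""
    return math.copysign(abs(x) ** p, x)


def output_component(scale, probe_amp, probe_phase, k=5):
    """Complex component at harmonic k of the device output.

    Input: unit-amplitude primaries at harmonics 6 and 7 plus a probe tone
    at harmonic 5 (the 2f1-f2 frequency), all multiplied by `scale`.
    """
    z = 0j
    for n in range(N):
        th = 2 * math.pi * n / N
        x = scale * (math.sin(6 * th) + math.sin(7 * th)
                     + probe_amp * math.sin(5 * th + probe_phase))
        z += power_device(x) * cmath.exp(-2j * math.pi * k * n / N)
    return z / N


z_low = output_component(1.0, 0.1, 2.0)
z_high = output_component(10.0, 0.1, 2.0)   # whole input raised by 20 dB
print(abs(z_high - 10 ** P * z_low) < 1e-9)
```

Because the output component at 2f1-f2 is merely multiplied by a constant when the input is scaled, a probe that brings it to zero at one stimulus level does so at every level once it is scaled by the same factor as the primaries. Under these assumptions the cancellation amplitude is therefore proportional to the stimulus amplitude.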

In thinking of a plausible cochlear origin of the CDT, one might argue
that the interpretation of the cancellation procedure is not quite correct.
Stimulus components are distributed over the cochlea according to their
frequency and the CDT behaves like any stimulus component. Therefore, the
cancellation tone might by-pass the area where f1 and f2 interact and
cancel the CDT at its proper place. At issue here is the question of whether
the CDT is cancelled at its presumed place of origin (where f1 and f2
interact) or at the place corresponding to its frequency. In the latter
case, the cancellation tone, after it passes through the nonlinearity without
interference by f1 or f2, levels off with the combination component 2f1-f2
at the output of a nonlinear device fed with f1 and f2. The results of such
a calculation are added to Fig. 4 in dashed lines. In comparison with the
first predictions the values are generally lower and the slopes of the curves as



337 



Smoorenburg; COMBINATION TONES AND LATERAL INHIBITION IN HEARING 




level of stimulus component (s) in dB 



Fig. 4. Predictions for the first (solid curves) and 
second (dashed curves) interpretation of cancellation 
based on a power device with power p= 0.6. Predictions 
of the pulsation threshold are also given by the dashed 
curves. Straight lines are predictions as a function of 
peaked curves as a function of (Lj = 

cons t) . 

a function of I 2 only are steeper. 
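The second interpretation can be sketched numerically as follows, assuming
the memoryless power device f(x) = sign(x)|x|^p with p = 0.6: the probe tone
passes through the device alone, and its input amplitude is chosen so that
its output equals the output component at 2f1-f2 produced by the two
primaries. The bin numbers, the sample count and the function names are
illustrative assumptions.

```python
import math

def power_device(x, p):
    """Odd-symmetric power law f(x) = sign(x) * |x|**p."""
    return math.copysign(abs(x) ** p, x)

def bin_amplitude(samples, k):
    """Amplitude of the component in DFT bin k (single-bin DFT)."""
    n = len(samples)
    re = sum(s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

def two_tone_output(a1, a2, p, k_out, k1=41, k2=47, n=4096):
    """Output amplitude in bin k_out for the input a1*cos(f1 t) + a2*cos(f2 t)."""
    out = [
        power_device(
            a1 * math.cos(2.0 * math.pi * k1 * i / n)
            + a2 * math.cos(2.0 * math.pi * k2 * i / n),
            p,
        )
        for i in range(n)
    ]
    return bin_amplitude(out, k_out)

def probe_amplitude(a1, a2, p, k1=41, k2=47):
    """Input amplitude of a lone probe whose output equals the CDT output."""
    cdt = two_tone_output(a1, a2, p, 2 * k1 - k2, k1, k2)
    # Output amplitude at the fundamental for a lone unit-amplitude tone.
    unit_gain = two_tone_output(1.0, 0.0, p, k1, k1, k2)
    # A lone tone of amplitude A yields unit_gain * A**p at its own frequency.
    return (cdt / unit_gain) ** (1.0 / p)

def slope_db_per_db(amp_ratio_out, amp_ratio_in):
    """Slope of output level versus input level, both expressed in dB."""
    return math.log10(amp_ratio_out) / math.log10(amp_ratio_in)

p = 0.6
# Both primaries raised together (L1 = L2): factor 2 in amplitude.
slope_equal = slope_db_per_db(probe_amplitude(2.0, 2.0, p) / probe_amplitude(1.0, 1.0, p), 2.0)
# Only L2 raised, with L2 well below L1.
slope_l2 = slope_db_per_db(probe_amplitude(1.0, 0.02, p) / probe_amplitude(1.0, 0.01, p), 2.0)
```

In this sketch the probe level follows the stimulus level with a slope of 1
for L1 = L2, while the slope as a function of L2 alone comes out steeper
than 1, in line with the statement about the dashed curves.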

TWO MEASURING METHODS 

In view of the differing predictions from the two interpretations of
the cancellation procedure, another measuring method that relates directly
to the second interpretation is of interest. The essence of the latter
interpretation was that the probe tone (cancellation tone) by-passes the
stage where the primaries f1 and f2 interact, such that it is not affected
by the stronger primaries. A simple way of accomplishing this by-passing is
a nonsimultaneous presentation of stimulus and probe tone. We adopted the
alternation method recommended by Houtgast in this symposium. The stimulus
and the probe tone are alternated at a rate of 4 Hz and the level of the
probe tone is increased until a pulsation of this tone becomes just audible.





Just below this "pulsation threshold" a continuous tone is perceived and
this is interpreted as an equality of the levels of the CDT generated by the
stimulus f1, f2 and the probe tone of frequency 2f1-f2. The results from this
method are presented in Figs. 5 and 6 for L1=L2 and L1=const., respectively.
The thin curves represent cancellation results which were collected at the
same time.




Fig. 5. Pulsation threshold (heavy curves) and cancellation
level (thin curves) for 2f1-f2 as a function of
stimulus level (L1 = L2) with parameter f2/f1. Results
for one subject.

Fig. 5 shows that the difference between the cancellation levels and
the pulsation thresholds decreases with increasing frequency ratio of the
primaries. This might be understood on the basis of the two interpretations
of the cancellation procedure. For small frequency differences there is not
much separation of the frequency components in the cochlear representation
and, thus, the first interpretation, where all frequency components pass
through the same channel, might be favored. In agreement with the
descriptive nonlinearity the pulsation thresholds, which relate to the second
interpretation, were found to be lower. At greater frequency differences the
second interpretation of the cancellation procedure might be favored. In that
case the probe tone would not be subject to interaction from the primaries
for both methods, either because of cochlear frequency separation in the case
of cancellation or because of the nonsimultaneous presentation in the
pulsation technique. This would explain the nearly equal results for both
methods at the greater frequency differences.

Fig. 6. Pulsation threshold (heavy curves) and cancellation
level (thin curves) for 2f1-f2 as a function of
L2; L1 = 55 dB SL. Results for two subjects.
Axis: sensation level in dB.

EFFECTS OF LATERAL SUPPRESSION 

Besides the generation of combination tones, amplitude-limiting
nonlinearities like the power device discussed before with 0<p<1 also show
suppression effects. Weak frequency components are suppressed by stronger
ones. Psychophysically, Houtgast first demonstrated suppression effects in
hearing with nonsimultaneous probes. Results from measurements with the
pulsation technique are presented in Fig. 7. Here we studied the suppression
of a stimulus component 2f1-f2 by f1 in the absence of f2; f2/f1 = 1/6 oct,
f1 = 1 kHz, L1 = 55 dB SL. The levels of the component 2f1-f2 exceeding L1
appear to be unaffected, whereas for levels lower than L1 increasingly
smaller values than the stimulus values are measured. This is the suppressing
effect of f1. A plateau is reached when the activity introduced by 2f1-f2 is
smaller than the activity introduced by f1 at the place corresponding to
2f1-f2. A curve calculated on the basis of a power function with p = 0.6 (the
value of p used in the combination tone calculations) is also given in
Fig. 7.

Fig. 7. Suppression of a tone of frequency 2f1-f2 by
f1 only; f1 = 1 kHz, f2/f1 = 1/6 oct., L1 = 55 dB SL.
Results for one subject.

The calculations show suppression effects whenever a stronger tone (the
suppressor) is added to the first tone. The decrease of the output level of
the first tone can be expressed as a reduction of the input level of the
first tone presented alone, such that its output level matches the output
level of the first tone in the presence of the suppressor. The input levels
of the single tone are given in the solid curve.
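The calculation described above can be sketched in Python as follows,
assuming the memoryless power device f(x) = sign(x)|x|^p with p = 0.6. The
function `equivalent_input` returns the input amplitude of the tone alone
that yields the same output as the tone in the presence of the suppressor.
Bin numbers, sample count, amplitudes and names are illustrative
assumptions, not values from the paper.

```python
import math

def power_device(x, p):
    """Odd-symmetric power law f(x) = sign(x) * |x|**p."""
    return math.copysign(abs(x) ** p, x)

def bin_amplitude(samples, k):
    """Amplitude of the component in DFT bin k (single-bin DFT)."""
    n = len(samples)
    re = sum(s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

def tone_output(a_tone, a_suppressor, p, k_tone=35, k_sup=41, n=4096):
    """Output amplitude at the tone frequency with a suppressor present."""
    out = [
        power_device(
            a_tone * math.cos(2.0 * math.pi * k_tone * i / n)
            + a_suppressor * math.cos(2.0 * math.pi * k_sup * i / n),
            p,
        )
        for i in range(n)
    ]
    return bin_amplitude(out, k_tone)

def equivalent_input(a_tone, a_suppressor, p):
    """Input amplitude of the tone alone that gives the same output level."""
    unit_gain = tone_output(1.0, 0.0, p)  # lone unit tone: unit_gain * A**p
    return (tone_output(a_tone, a_suppressor, p) / unit_gain) ** (1.0 / p)

p = 0.6
weak = equivalent_input(0.1, 1.0, p)     # tone 20 dB below the suppressor
strong = equivalent_input(10.0, 1.0, p)  # tone 20 dB above the suppressor
weak_shift_db = 20.0 * math.log10(weak / 0.1)
strong_shift_db = 20.0 * math.log10(strong / 10.0)
```

In this sketch the weaker tone comes out with a clearly reduced equivalent
input level, while the tone that exceeds the suppressor is left nearly
unchanged, which is the qualitative pattern described for Fig. 7.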

The description given so far is incomplete. Since the suppressor is
stronger than the first tone, the first tone is not only suppressed but also
masked. The type of suppression described so far is related to the study by
Rose et al. (1974) in which period histograms of two-tone stimuli are
analysed. For two tones well within the response area they found that the
amplitude at which one tone is represented in the period histogram
diminishes when a second tone is introduced at a sufficiently high level.
However, in order to describe the psychophysical findings where the
suppressed tone is perceived distinctly, and also the electrophysiological
findings where tones well outside the response area of the unit suppress
tones inside the response area, we need to invoke a mechanism that separates
the frequency components of the stimulus after the nonlinear interaction. To
explain the major effect of two-tone suppression, that is the suppression
of a tone by another tone of a higher frequency, the traveling-wave
mechanism might be a candidate. Suppression might take place at the site
where the suppressor introduces the greatest excitation. At that site the
tone of lower frequency is suppressed (and masked), but as it travels on to
its proper place the excitation by the tone of lower frequency grows and
exceeds the excitation by the suppressor. In addition to this mechanism,
other mechanisms that also work in the opposite direction can be proposed.
Such mechanisms of additional frequency separation or "second filters" may
have a mechanical nature, e.g. a frequency selectivity or directional
selectivity of the hair-cell transducer (see comment by Duifhuis), or they
may have a neural nature such as coincidence networks (comb filters).

DISCUSSION 

Apparently, the cancellation data of the CDT and the tone-on-tone
suppression data can be mimicked by a simple amplitude-limiting nonlinearity.
However, the CDT pulsation data do not seem to agree. On the one hand, for
small frequency differences the pulsation threshold of the CDT increases
less than proportionally to the stimulus level. In addition, the difference
between the cancellation data and the pulsation data increases with
the stimulus level. For a power device this would imply a decreasing power
p with increasing stimulus level. On the other hand, the cancellation
data tend to level off at higher stimulus levels, which suggests that the
system behaves more linearly, hence greater p. A third finding that seems
to disagree with the predictions is that the slopes of the pulsation
curves as a function of L2 only are not steeper than +1 or -1 for L2<L1
or L2>L1, respectively.

Even before we have touched upon the effects of cochlear frequency
distribution, it already seems for small frequency differences that the
simple amplitude-limiting nonlinearity cannot account for both the generation
of the CDT and lateral suppression. Of course, alternative mechanisms of
lateral suppression may be postulated. Inhibition of the inner hair cell
output by the outer hair cell output is a compelling proposition. Such
separate mechanisms of lateral suppression may be responsible for the
differences between the cancellation and pulsation data. But there remains
the question of to what extent the nonlinearity generating the combination
tones like 2f1-f2 also accounts for effects of lateral suppression.

REFERENCES 

Crane, H.D. (1966). "Mechanical Impact: A Model for Auditory Excitation and
Fatigue", J. Acoust. Soc. Amer. 1147-1159.

Crane, H.D. (1972). "Mechanical Impact and Fatigue in Relation to Nonlinear
Combination Tones in the Cochlea", J. Acoust. Soc. Amer. 508-514.

Dallos, P. (1969). "Combination Tone 2f_l-f_h in Microphonic Potentials", J.
Acoust. Soc. Amer. 48, 1437-1444.

Goldstein, J.L. (1967). "Auditory Nonlinearity", J. Acoust. Soc. Amer. 41,
676-689.

Hall, J.L. (1972). "Auditory Distortion Products f2-f1 and 2f1-f2", J.
Acoust. Soc. Amer. 1863-1880.

Plomp, R. (1965). "Detectability Threshold for Combination Tones", J. Acoust.
Soc. Amer. 37, 1110-1123.

Rose, J.E., Kitzes, L.M., Gibson, M.M., and Hind, J.E. (1974). "Observations
on Phase-Sensitive Neurons of Anteroventral Cochlear Nucleus of the Cat:
Nonlinearity of Cochlear Output", J. Neurophysiol. 218-253.

Smoorenburg, G.F. (1972). "Combination Tones and Their Origin", J. Acoust.
Soc. Amer. 52, 615-632.

Wilson, J.P. and Johnstone, J.R. (1973). "Basilar Membrane Correlates of
the Combination Tone 2f1-f2", Nature 241, 206-207.

Zwicker, E. (1955). "Der ungewöhnliche Amplitudengang der nichtlinearen
Verzerrungen des Ohres", Acustica 67-74.







COMMENT ON: "On the mechanisms of combination tone generation 
and lateral inhibition in hearing" (SMOORENBURG) 

R. HELLE, H. FASTL, E. ZWICKER

Institut für Elektroakustik, Technische Universität München

Although we agree in general with the description of the data
produced by the cancellation method, as given in the paper by
Smoorenburg, there are still some differences in the interpretation
that we would like to point out. Smoorenburg's descriptive hypothesis
seems to be too simple for the relatively complicated
frequency-place dependence of the system, including large
nonlinearity. As we have shown, the creation of the cubic
difference tone depends on frequency distance, i.e. place
(Zwicker, 1968), as well as on amplitude and phase of the
primaries (Helle, 1969/70). This could be confirmed for narrow-band
noise as well as tonal primaries by means of masking
experiments (cf. Greenwood, 1972; Zwicker and Fastl, 1973).

The cubic difference tone seems to be the result of many
partial components, originating along the cochlear partition,
as indicated by the dip in the contour of the cancellation-level
versus primary-level curve (Helle, 1969/70, Fig. 6). In
addition, the data given there in Fig. 3 and Fig. 5 obviously
do not show the symmetrical shape predicted by Smoorenburg
(Fig. 4 of his paper), but exhibit a considerably flatter
slope for high primary levels. For L2 < L1 - 10 dB the
dependence on level could, as a first approximation, be
described quite simply (Helle, 1969/70, Fig. 15), even for
different frequencies and frequency distances of the primaries.
REFERENCES

Greenwood, D.D. (1972). Masking by Narrow Bands of Noise in
Proximity to More Intense Pure Tones of Higher Frequency:
Application to Measurement of Combination Band Levels and
Some Comparisons with Masking by Combination Noise.
J. Acoust. Soc. Amer. 1137-1143.

Helle, R. (1969/70). Amplitude und Phase des im Gehör gebildeten
Differenztones 3. Ordnung. Acustica 74-87.

Zwicker, E. (1968). Der kubische Differenzton und die Erregung
des Gehörs. Acustica 206-209.

Zwicker, E. and Fastl, H. (1973). Cubic Difference Sounds
Measured by Threshold- and Compensation-Method. Acustica
29, 336-343.







SUBJECTIVE PHASE EFFECTS AND COMBINATION TONES 
T.J.F. BUUNEN AND F.A. BILSEN 

Biophysics Group, Applied Physics Department, Delft University of 
Technology, Delft, The Netherlands 

I INTRODUCTION 

One of the problems in psychoacoustics is the question of which
temporal aspects of a signal are coded and contribute to the sensation.
For example, the sensation of "roughness" seems to correspond to
envelope detection (Terhardt, 1968). Another subjective aspect of sound
which is fairly often ascribed to time-structure detection is the
"residue pitch" or "periodicity pitch" (Schouten et al., 1962). For
three-component signals, Mathes and Miller (1947), de Boer (1956) and
Goldstein (1967a) have reported changes in the prominence of residue
pitch, as well as in timbre and roughness, as the time structure changed
while the power spectrum remained constant.

Recent investigations, however, have produced some arguments
against an explanation of residue pitch by way of time structure
detection. Terhardt (1972) has summed up some of these arguments. Houtsma
and Goldstein (1972) have shown that monotic interaction of frequency
components is not prerequisite for the perception of residue pitch.
This calls for a re-interpretation of the subjective phase effects as
reported by, among others, Mathes and Miller (1947). One intuitive
argument for ascribing phase effects to time structure detection might be
the experimental finding that the frequency difference in a narrow-band
signal should not be too large, apparently in order to preserve time
structure after the frequency analysis by the basilar membrane.
Another argument might be that only variations in the time structure
occur as phases in a narrow-band signal are changed; the power
spectrum is not affected. This argument, however, is only valid if the
signal transformations which occur in the hearing system are linear.
Investigations by many authors (Zwicker, 1955; Plomp, 1965; Goldstein,
1967b) indicate that these transformations do have non-linear
properties. Viz., a two-component signal f1, f2 evokes frequency components
of the form (n+1)f1-nf2 (n ≥ 1, integer), which are commonly known as
combination tones (CT's).







Buunen and Bilsen: PHASE EFFECTS 



We suppose that these non-linearities might play a role in the
detection of phase effects. For example, in a three-component signal
consisting of the frequencies f0-Δf, f0 and f0+Δf it can be easily seen
that the first-order CT (n=1) generated by f0 and f0+Δf has the
frequency f0-Δf. So there will be an interaction of this CT and the
lowest acoustical frequency component. The result of this interference
will be phase dependent. From this we arrive at the hypothesis that
phase effects may also be due to this type of interaction. The
experiments reported below are meant to test this hypothesis.
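The phase dependence of this interference can be illustrated with a simple
phasor sum. The Python sketch below treats the acoustical component and the
CT as two sinusoids of the same frequency and returns the amplitude of their
resultant. The amplitudes are arbitrary illustrative values and the function
name is an assumption; nothing here is taken from measured data.

```python
import cmath
import math

def internal_amplitude(a_acoustic, a_ct, phase_deg):
    """Magnitude of the sum of an acoustical component and a CT of equal frequency.

    phase_deg is the phase of the acoustical component relative to the CT.
    """
    phasor = a_acoustic * cmath.exp(1j * math.radians(phase_deg)) + a_ct
    return abs(phasor)

# Illustrative amplitudes in arbitrary linear units (not measured values).
in_phase = internal_amplitude(1.0, 0.5, 0.0)      # components add
antiphase = internal_amplitude(1.0, 0.5, 180.0)   # components subtract
quadrature = internal_amplitude(1.0, 0.5, 90.0)   # sqrt(1.0**2 + 0.5**2)
```

Depending on the relative phase, the resultant ranges between the difference
and the sum of the two amplitudes, which is the kind of phase-dependent
level change the hypothesis relies on.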

II SOME INDICATIVE EXPERIMENTS. 

In Fig. 1a the phase dependency of the resultant of the first-order
CT and the lowest acoustical component in a three-component
signal has been schematized. Until further notice, we will ignore CT's
of higher order (n ≥ 2). Then, in our conception, phase effects should
disappear when the CT is cancelled.

In a first experiment this is tested in the following way. Three
components, f0-Δf, f0 and f0+Δf, were generated by three independent
sources. Listening to this signal, a subject can adjust f0 until he
hears a very distinct beating sound. Then a component is added to the
signal with a frequency of 2f0-(f0+Δf). We derived this component from
f0 and f0+Δf by means of a non-linear diode network followed by a
narrow bandpass filter tuned to the desired frequency. Now, the subject
is asked to adjust phase and amplitude of this component so that the
beating sensation disappears. This can only be done if the frequency
difference Δf is small enough to allow phase effects and, on the other
hand, sufficiently large to allow only the first-order CT of the two
highest components to be detectable. If these phase effects originated
from time structure detection, it should be impossible to
extinguish them in this way, because the time pattern of the acoustical
signal is still changing after the cancellation procedure. (At
frequency f0-Δf there is a beating acoustical component.) This effect
can only be explained if these phase effects originate
from the CT.







Fig. 1. Schematic representation of the "internal spectrum". An
internal component (i.c.) is the resultant of the acoustical
component (a.c.) and the combination tone (c.t.) of the same
frequency.

A second experiment is also illustrative in this respect. The
three components are not freely running now, but generated by
dividing a common clock frequency. In that case each particular phase
relation of the components can be investigated. A subject who
manipulates the phase of the lowest component (f0-Δf) appears to be
able to choose such a phase that a reversal of this particular phase
is not perceptible. At all other phases a change of 180° is clearly
perceptible. The explanation of this will be clear after inspection
of Fig. 1b.
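One possible reading of this result, assuming that only the amplitude of the
internal component matters, is the following. The symbols A (amplitude of the
acoustical component), C (amplitude of the CT), R (amplitude of the
resultant) and \varphi (phase of the acoustical component relative to the
CT) are introduced here for illustration only.

```latex
\begin{align*}
R(\varphi)^2       &= A^2 + C^2 + 2AC\cos\varphi, \\
R(\varphi+\pi)^2   &= A^2 + C^2 - 2AC\cos\varphi, \\
R(\varphi) = R(\varphi+\pi) \;&\Longleftrightarrow\; \cos\varphi = 0
  \;\Longleftrightarrow\; \varphi = \pm 90^\circ .
\end{align*}
```

Under this assumption a phase reversal leaves the internal amplitude
unchanged only when the acoustical component is in quadrature with the CT; at
every other phase a change of 180° alters the amplitude of the resultant.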

A third experiment to demonstrate the validity of our hypothesis
can be performed as follows. The subject adjusts phase and amplitude
of the component (f0-Δf) until no difference is heard between a
two-component signal f0, f0+Δf and the three-component signal
(phase-locked). Thus, the indiscriminability of a two-component signal and
a three-component signal with specific phases and amplitudes can be
explained by assuming that the resultant of the acoustical component
and the CT in the three-component signal is just equal to the level
of the CT alone in the two-component signal. This has been
schematized in Fig. 1c.









These experiments indicate that it is necessary to consider the
"internal spectrum" with its "internal" components rather than the
(external) spectrum of the acoustical signal. For the signals
considered, subjective phase effects seem to be correlated with changes in
the internal amplitude spectrum. Phases of the internal components
are irrelevant.

III THE EXISTENCE REGION

As mentioned in the introduction, phase effects only occur if
the frequency separation between the components is not too large. In
the concept of time structure detection this has been interpreted as
a requirement to permit a complex time structure to be detected after
the rather limited frequency analysis by the basilar membrane
(Zwicker, 1952; Goldstein, 1967a). In view of our alternative
hypothesis, we would point to a possible relation with the result of
Zwicker (1955), who finds that CT's occur only if the frequency
separation of the generating components is not too large. De Boer (1956)
has given a method to determine the maximal frequency separation for
phase effects to be perceptible. Smoorenburg (1972a) has measured the
"audibility region" of CT's. These two methods have been combined to
determine a correlation between the frequency separation for just
audible phase effects and for just audible CT's.

A three-component signal f_car-Δf, f'_car and f_car+Δf was used. The
highest and the lowest component were generated by analogue
multiplication of a frequency f_car and Δf. A component f'_car = f_car+2 Hz
was added. Listening to this signal one perceives a beating sound with
a beat frequency of 4 Hz, if Δf is not too large. The amplitudes of
the three components were equal, about 30 dB SL. The subject was
asked to diminish the frequency Δf until the beats were just
perceptible. Thereafter, the signal was replaced by a two-component signal
of equal amplitudes and frequencies f_car and f_car+Δf. The amplitudes
were the same as in the previous experiment. Now the subject was
asked to lower f_car+Δf until he heard the CT f_car-Δf going up in pitch.
The frequency Δf was registered. This was done for several values of
f_car. The results of three subjects are shown in Fig. 2.
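On the CT hypothesis, the 4-Hz beat rate of this signal follows from simple
arithmetic: the first-order CT of the two upper components lies 4 Hz above
the lowest acoustical component. The Python sketch below works this out. The
carrier frequency of 2000 Hz and the separation of 200 Hz are illustrative
values only, and the function names are assumptions.

```python
def first_order_ct(f_low, f_high):
    """First-order combination tone 2*f_low - f_high (n = 1)."""
    return 2.0 * f_low - f_high

def beat_frequency(f_car, delta_f, offset=2.0):
    """Beat rate between the CT of the two upper components and the lowest one."""
    f_mid = f_car + offset        # the added component f'_car = f_car + 2 Hz
    f_high = f_car + delta_f      # upper product of the multiplication
    f_lowest = f_car - delta_f    # lower product of the multiplication
    return abs(first_order_ct(f_mid, f_high) - f_lowest)

# Illustrative values: f_car = 2000 Hz, delta_f = 200 Hz.
beat = beat_frequency(2000.0, 200.0)  # CT at 1804 Hz against 1800 Hz
```

The result does not depend on f_car or Δf: the 2-Hz offset of the middle
component is doubled in the CT 2f'_car-(f_car+Δf).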







Each measured point is the average of at least 4 and at most 6
adjustments. It can be seen that the maximum frequency separations Δf in
both experiments correspond rather well. This supports our hypothesis
on the influence of CT's on subjective phase effects.

The left figure on the bottom of Fig. 2 gives a comparison of 
our results with those of other authors. The solid curve in this 
figure is the "audibility region" as measured by Smoorenburg (1972a). 

The curve marked FB represents the maximal frequency separation
for phase susceptibility in broad-band signals as measured by Bilsen
(1968). There, the criterion was the detection of any subjective
difference between low-pass filtered periodic pulses and periodic
noise. The cutoff frequency of the filter is equivalent to The 

dotted curve in the same figure gives the highest modulation fre- 
quency (=frequency separation) for which a difference between 100%-AM 
and QFM can be perceived, as measured by Goldstein (1967a). All curves 
correspond reasonably well to our measured points. 



Fig. 2. Boundary for subjective phase effects (open circles) and
audibility region for combination tones (filled circles). Curves
in the lower left figure indicate results obtained by Smoorenburg
(1972a), Goldstein (1967a) and Bilsen (1968). Panel labels
recoverable from the figure: J.R., T.E.; axis unit: Hz.




IV THE INTERNAL SPECTRUM 

A very direct method to obtain information about the internal
spectrum is by means of a cancellation method. If an interaction
between the lowest component and a CT in a three-component signal
exists, a cancellation of the resultant by an extra acoustical
component should demonstrate this. The cancellation method has been
described in detail by Zwicker (1955) and Goldstein (1967b), so we omit
it here.

A signal consisting of 1800, 2000 and 2200 Hz was used (Buunen et
al., 1974). The amplitudes of 2000 and 2200 Hz were equal and about
40 dB SL. The amplitude of the 1800-Hz component was 10 dB lower. The
components were phase locked and their phases could be controlled
manually. Only the phase of the 2000-Hz component was varied, the
amount of variation out of cosine phase being denoted by θ. The
equivalent levels of the internal components of 1800, 1600 and 1400 Hz
were measured by means of cancellation as a function of θ. Results
for three subjects are given in Fig. 3. A dependence of the internal
spectrum upon the phase θ is demonstrated.
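The three measured frequencies can be related to the formula (n+1)f1-nf2
from the introduction. The Python sketch below lists, for each pair of
components of the 1800/2000/2200-Hz signal, the combination tones up to
n = 3 that fall at 1400 Hz or above. The cut-off at order 3 and at 1400 Hz,
and all names, are illustrative assumptions.

```python
from itertools import combinations

def combination_tones(f_low, f_high, max_order):
    """Frequencies (n+1)*f_low - n*f_high for n = 1..max_order."""
    return [(n + 1) * f_low - n * f_high for n in range(1, max_order + 1)]

components = [1800, 2000, 2200]  # Hz, the signal of section IV

# Map each CT frequency to the (f_low, f_high, n) triples that produce it.
contributions = {}
for f_low, f_high in combinations(components, 2):
    for n, f_ct in enumerate(combination_tones(f_low, f_high, 3), start=1):
        if f_ct >= 1400:  # restrict to the range of the measured components
            contributions.setdefault(f_ct, []).append((f_low, f_high, n))

for f_ct in sorted(contributions, reverse=True):
    print(f_ct, "Hz:", contributions[f_ct])
```

In this enumeration 1800 Hz receives one CT (from 2000 and 2200 Hz) next to
the acoustical component, 1600 Hz receives two CTs, and 1400 Hz three, so
each internal component is in general a sum of several contributions.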




Fig. 3. Results of cancellation experiments. Each point represents one
cancellation. Arrows indicate that the internal component is near
the threshold of hearing.







An objection against cancellation measurements in general might
be the interaction between the signal that has to be measured and the
cancellation signal (Goldstein, 1967b; Smoorenburg, 1972b). This
objection has been overcome by estimating the internal spectrum by
way of a forward-masking experiment. The envelope of the signals used
in this case is shown in Fig. 4.




Fig. 4. The stimulus configuration of the masking experiment.
Masker: three-component signal. Probe: pure tone.



The "masker" is the same three-component signal as in the cancellation
measurements. The "probe" is a pure tone, switched with a sinusoidal
envelope. The amount of threshold increase, ΔL, of this probe tone for
different values of θ has been determined. Results are shown in Fig. 5.
Essentially, they are the same as the results of the cancellation.





Fig. 5. Results of the masking experiment. ΔL is the
threshold increment of the probe signal.








DISCUSSION 

In the literature, subjective phase effects have often been ascribed
to the detection of changes in the temporal structure of a signal (see
Plomp (1970) for an extensive review on timbre in general, and phase
effects in particular). Indeed, for a certain category of signals only
the temporal structure can account for differences in perception as
phases are changed. For example, a periodic impulse and periodic noise
(maximum length sequence) have equal power spectra but markedly
different time structures. It is not astonishing that, for a repetition
frequency of say 1 Hz, both signals are perceptually quite different
(Bilsen, 1968). Also the class of stimuli investigated by Terhardt
(1968), which gives rise to a sensation of roughness, indicates the
importance of temporal (envelope) detection.

Particularly, de Boer's phase rule seems to be in favour of
temporal detection (de Boer, 1961). This rule states that the timbre
of a sound does not change when the phases of the components are
shifted by a constant amount and/or amounts that are linearly
dependent on frequency. A mathematical analysis readily shows that, under
these conditions, the temporal envelope of the signal is invariant.
For low envelope periodicities, undoubtedly this has a direct bearing
on perception. For higher periodicities (= larger frequency
separation), however, an alternative reasoning based on CT interaction
applies. Buunen et al. (1974) derived mathematically that the phase
angle between a CT and an acoustical component of the same frequency,
and thus the resultant, remains unchanged if phase changes in the
acoustical components obey de Boer's phase rule. This is in
accordance with the result that phase effects are absent if the internal
spectrum remains unchanged.
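Both statements about de Boer's phase rule can be checked numerically. The
Python sketch below applies a constant phase shift plus a shift linear in
frequency to a three-component signal and verifies (a) that the envelope is
unchanged apart from a time shift, and (b) that the phase of the first-order
CT relative to the lowest component is unchanged. It assumes that the CT
phase equals 2*phi_mid - phi_high plus a constant, as for a cubic-type
distortion product; amplitudes, phases, shift values and names are
illustrative.

```python
import cmath
import math

def envelope(freqs, amps, phases, t):
    """Envelope |sum a_k exp(i(2 pi f_k t + phi_k))| of a narrow-band signal."""
    return abs(sum(a * cmath.exp(1j * (2.0 * math.pi * f * t + ph))
                   for f, a, ph in zip(freqs, amps, phases)))

def shift_phases(freqs, phases, alpha, tau):
    """De Boer's rule: add a constant alpha plus a term linear in frequency."""
    return [ph + alpha + 2.0 * math.pi * f * tau for f, ph in zip(freqs, phases)]

def ct_relative_phase(phases):
    """Phase of the CT 2*f_mid - f_high relative to the lowest component.

    Assumption: the CT phase is 2*phi_mid - phi_high plus a constant.
    """
    phi_low, phi_mid, phi_high = phases
    return (2.0 * phi_mid - phi_high) - phi_low

freqs = [1800.0, 2000.0, 2200.0]   # Hz, equidistant components
amps = [0.3, 1.0, 1.0]             # arbitrary linear amplitudes
phases = [0.4, 1.1, -0.7]          # arbitrary starting phases in radians
alpha, tau = 0.9, 1.3e-4           # arbitrary constant shift and delay (s)
shifted = shift_phases(freqs, phases, alpha, tau)

times = [i * 1e-5 for i in range(500)]
# The shifted signal's envelope at t equals the original envelope at t + tau.
max_envelope_error = max(
    abs(envelope(freqs, amps, shifted, t) - envelope(freqs, amps, phases, t + tau))
    for t in times
)
relative_phase_change = ct_relative_phase(shifted) - ct_relative_phase(phases)
```

The relative phase is preserved because the components are equidistant, so
that 2*f_mid - f_high coincides with the frequency of the lowest component.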

Another interesting phase effect, viz. the dependence of the
prominence of residue pitch on phase (Mathes and Miller, 1947; de
Boer, 1956), is understood in view of our conception of the internal
spectrum. For the three-component signal of section IV, it has been
measured and explained that the percentage of subjects' responses
corresponding to residue pitch is minimal when the internal
components below 2000 Hz, which dominate with respect to pitch, have
minimal amplitude (Buunen et al., 1974).

In conclusion, we have shown that a class of phase effects exists
that finds its cause in the interaction of CT's and acoustical
components of a signal. These phase effects are perceptible as long as CT's
are perceptible (section III). They are absent as long as the internal
spectrum remains unchanged (sections II and IV). Future investigations
will be concentrated on making a distinction between phase effects
caused by CT interaction and those caused by time structure detection.



REFERENCES 

Bilsen, F.A. (1968), On the interaction of a sound with its repetition.
Doctoral Dissertation, Univ. of Delft.

Boer, E. de (1956), On the residue in hearing. Doctoral Dissertation,
Univ. of Amsterdam.

Boer, E. de (1961), A note on phase distortion and hearing, Acustica 11,
182-184.

Buunen, T.J.F., J.M. Festen, F.A. Bilsen and G. v.d. Brink (1974),
Phase effects in a three-component signal, J. Acoust. Soc. Am. 55,
297-303.

Goldstein, J.L. (1967a), Auditory spectral filtering and monaural phase
perception, J. Acoust. Soc. Am. 458-479.

Goldstein, J.L. (1967b), Auditory nonlinearity, J. Acoust. Soc. Am. 41,
676-689.

Houtsma, A.J.M., and J.L. Goldstein (1972), The central origin of the
pitch of complex tones: Evidence from musical interval recognition,
J. Acoust. Soc. Am. 51, 520-529.

Mathes, R.C. and R.L. Miller (1947), Phase effects in monaural perception,
J. Acoust. Soc. Am. 19, 780-797.

Plomp, R. (1965), Detectability threshold for combination tones,
J. Acoust. Soc. Am. 37, 1110-1123.

Plomp, R. (1970), Timbre as a multidimensional attribute of complex tones,
in Frequency Analysis and Periodicity Detection in Hearing, Plomp
and Smoorenburg Eds, Sijthoff, Leiden, p. 397-411.

Smoorenburg, G.F. (1972a), Audibility region of combination tones,
J. Acoust. Soc. Am. 52, 603-614.

Smoorenburg, G.F. (1972b), Combination tones and their origin,
J. Acoust. Soc. Am. 52, 615-632.

Schouten, J.F., R.J. Ritsma and B.L. Cardozo (1962), Pitch of the residue,
J. Acoust. Soc. Am. 34, 1418-1424.

Terhardt, E. (1968), Über die durch amplitudenmodulierte Sinustöne
hervorgerufene Empfindung, Acustica 20, 210-214.

Terhardt, E. (1972), Zur Tonhöhenwahrnehmung von Klängen I.
Psychoakustische Grundlagen, Acustica 26, 173-186.

Zwicker, E. (1952), Die Grenzen der Hörbarkeit der Amplitudenmodulation
und der Frequenzmodulation eines Tones, Acustica 2, 125-133.

Zwicker, E. (1955), Der ungewöhnliche Amplitudengang der nichtlinearen
Verzerrungen des Ohres, Acustica 5, 67-74.







PITCH OF PURE TONES: ITS RELATION TO INTENSITY 
E. TERHARDT 

Institut für Elektroakustik der Technischen Universität München, FRG



Introduction 

Compared with the tonal sounds of daily life (e.g. the human voice and the
sounds of music), pure tones may be considered to be of minor relevance.
However, it has recently been recognized that the pitch of complex signals
seems to be based principally on spectral cues (in contrast to temporal
ones). Thus, pure tones, and the pitch produced by them, again become
significant and interesting.

One of the phenomena which can be observed with pure tones is the influence
of sound pressure level (SPL) on their pitch, i.e. the pitch-intensity
effect. The data of Zurmühl (1930), Vierling (1934), Stevens (1935), Snow
(1936), and Walliser (1969) are rather well in line with each other,
showing that, with increasing intensity, low tones become slightly lower
and high tones slightly higher.

However, there still exist doubts as to whether the phenomenon is just an
artifact, caused by unsuitable methods and/or selection of subjects. These
doubts have been articulated and experimentally strengthened, in
particular, by Cohen (1961). Therefore, careful pitch-matching experiments
have been carried out.

1. Method and procedure 

Two pure tones were produced by tone generators with less than 0.1%
nonlinear distortion. The sound-pressure levels of both tones and the
frequency of one of them were controlled by the operator; the frequency of
the second tone was adjusted by the subject, situated in a sound-proof
cabin. By means of an electronic switching device, both tones were
presented alternately with a duration of 350 msec and a 200 msec pause. The
switches were followed by band-pass filters (1/3 octave) in order to avoid
clicks. The tones were presented to the subject through earphones (Beyer
DT 48).







Terhardt: PITCH-INTENSITY EFFECT 

In a first experiment, presentation was monotic, i.e., both alternating
tones were presented to the same ear. In a second experiment, dichotic
presentation was employed, i.e., the first tone was presented to one ear
and the second tone to the other one.* In both experiments the subject's
task was to adjust one tone's frequency so that the pitches of both tones
were as nearly equal as possible.

One of the tones, the "40 dB-tone", was consistently set at an SPL of 40 dB
(re 2·10⁻⁵ N/m²). The other one, the "SPL-tone", was set at either 40, 60 or
80 dB. In order to eliminate artifacts, each tone combination was presented
in two different ways: (1) the SPL-tone was preset in frequency by the
operator and the 40 dB-tone was adjusted by the subject to give the same
pitch; (2) the 40 dB-tone was preset in frequency and the subject adjusted
the SPL-tone.
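
For orientation, the three levels can be converted to sound pressures with the standard relation p = p_ref * 10^(L/20). This is a minimal sketch, assuming the usual reference pressure of 2·10⁻⁵ N/m²; the function name is hypothetical.

```python
P_REF = 2e-5  # reference sound pressure in N/m^2 (assumed standard value)

def spl_to_pressure(level_db):
    """Convert a sound pressure level in dB (re P_REF) to pressure in N/m^2."""
    return P_REF * 10 ** (level_db / 20)

for level in (40, 60, 80):
    print(f"{level} dB SPL corresponds to {spl_to_pressure(level):.4g} N/m^2")
```

Each 20 dB step corresponds to a tenfold increase in sound pressure, so the SPL-tone at 80 dB has 100 times the sound pressure of the 40 dB-tone.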

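The result of each adjustment is expressed by the value

$$ v = \left. \frac{f_{40\,\mathrm{dB}} - f_{\mathrm{SPL}}}{f_{\mathrm{SPL}}} \right|_{\text{equal pitch}} \tag{1} $$

($f_{40\,\mathrm{dB}}$ = frequency of the 40 dB-tone; $f_{\mathrm{SPL}}$ = frequency of the SPL-tone).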
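
A small numerical illustration of Eq. (1) may help with the sign convention. The frequencies below are hypothetical values chosen for illustration only, not measured data, and the function name is an assumption.

```python
def pitch_shift(f_40db, f_spl):
    """Relative pitch shift v of Eq. (1) as a fraction (x100 gives percent).

    f_40db: frequency of the 40 dB-tone at equal pitch (Hz)
    f_spl:  frequency of the SPL-tone at equal pitch (Hz)
    """
    if f_40db <= 0 or f_spl <= 0:
        raise ValueError("frequencies must be positive")
    return (f_40db - f_spl) / f_spl

# Hypothetical match: a 200 Hz SPL-tone is judged equal in pitch to a
# 40 dB-tone at 198 Hz, i.e. the louder tone sounds lower.
v = pitch_shift(f_40db=198.0, f_spl=200.0)
print(f"v = {100 * v:+.2f} %")  # prints: v = -1.00 %
```

A negative v therefore means that the SPL-tone sounds lower than a 40 dB-tone of the same frequency; a positive v means that it sounds higher.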

In the monotic experiment, the adjustments were made at the frequencies
200 Hz, 1 kHz, and 4 kHz. In the dichotic experiment, in addition, the
frequency 6 kHz was used. Since the effect of binaural diplacusis in
normal-hearing subjects depends sharply on frequency (see, e.g. van den
Brink, 1970), the nominal frequencies (as preset by the operator) were
realized with an error of less than ±1 Hz.

Fifteen subjects (co-workers of the institute, and students; 13 male, 2
female) with normal thresholds of hearing participated in the experiments.
In each of the two experiments, at least four adjustments were made by each
subject at each frequency and SPL. In addition, eight subjects performed
the entire monotic experiment twice, i.e. once with each ear.

2. Results 

The results are described in terms of the "pitch shift" v (Eq. 1) as a
function of the SPL-difference between both tones, ΔL. A positive
v-value at a particular ΔL indicates that (and how much) the pitch of a
tone with the SPL 40 dB + ΔL is higher than that of a 40 dB-tone with the
same frequency. Likewise, a negative v-value indicates a lower pitch.

With respect to the question whether the pitch-intensity effect is a fact 
or artifact, first the individual results of each subject were considered 
separately. 



2.1 Monotic experiment, individual results 

Fig. 1 depicts the median individual results, labelled by the numbers 1 to
15. Each square of the entire diagram contains two "pitch-shift contours" 
of the same subject, the upper one corresponding to 200 Hz, the lower one 
to 4 kHz. The contours are the result of connecting the median points by 
straight lines. Full lines indicate monotic presentation to the left ear, 
dashed lines correspond to right-ear presentation. 




Fig. 1. Pitch shift v (ordinate), produced by increasing the SPL
by ΔL (abscissa), monotic experiment. Numbers 1...15 correspond
to subjects. Upper half of each square: tone frequency 200 Hz;
lower half: 4 kHz. The lines connect medians of at least 4
adjustments. Full lines: left ear; dashed lines: right ear.





At the frequency 1 kHz the pitch-intensity effect was very small.
Therefore the individual medians provide little additional information and
have not been displayed.

Fig. 1 reveals:

(1) At ΔL = 0, the v-values are less than ±1% for all subjects and
frequencies. Hence, the "pitch-matching error" in this experiment was
maximally about one JND of frequency.

(2) At f = 200 Hz, the pitch-intensity effect qualitatively is the same
for all subjects: an increase of intensity causes a slight decrease
of pitch. Quantitatively, the effect is different for different
subjects (compare, e.g. Nos. 1 and 2).

(3) At f = 4 kHz, most subjects indicate an increase of pitch with
increasing intensity. For some subjects, however, the effect at 4 kHz
is practically zero (e.g. Nos. 11, 14, 15).

2.2 Dichotic experiment, individual results 

When both tones were presented alternately to both ears of the subject, 
two complications were recognized. 

First, it was more difficult than in the monotic case to match the pitches
of both tones, even in the case ΔL = 0. This is due to the phenomenon that,
in dichotic presentation, the two tones sound somehow different even
when their pitches are matched as well as possible. Therefore the
"pitch-matching error" may become almost as large as the effect to be
measured. This is the reason why the frequency 6 kHz was additionally
chosen, since the pitch-intensity effect was expected to be larger at
higher frequencies.

Fig. 2. Pitch shift contours of one subject, dichotic experiment. Lines
connect medians of at least 4 adjustments. Full lines: 40 dB-tone at the
left ear; dashed lines: 40 dB-tone at the right ear.

Second, binaural diplacusis causes additional systematic "pitch shifts"
which have to be separated from the pitch-intensity effect. Fig. 2 shows
typical results of one subject. The straight lines connect the medians of
four adjustments. Full lines indicate that the 40 dB-tone was presented to
the left ear (and the SPL-tone to the right one); dashed lines correspond
to the reversed case. Clearly, the different vertical positions of the
full and dashed lines, respectively, reveal the effect of diplacusis,
whereas the slope of the contours as a function of ΔL represents the
pitch-intensity effect. The effect of binaural diplacusis can be
eliminated simply by taking the medians of all adjustments from both kinds
of dichotic presentation.
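
The pooling step can be sketched as follows. The numbers are hypothetical, chosen only to show how a constant offset of opposite sign in the two ear arrangements drops out of the pooled median; the sketch assumes that both arrangements contribute equally many adjustments, and the function name is an assumption.

```python
from statistics import median

def pooled_median(v_40db_left, v_40db_right):
    """Median over all adjustments from both dichotic arrangements."""
    return median(list(v_40db_left) + list(v_40db_right))

# Hypothetical pitch shifts in percent at one frequency and one SPL difference.
# Diplacusis raises the values in one arrangement and lowers them in the other.
v_40db_left = [1.4, 1.5, 1.6, 1.7]    # 40 dB-tone at the left ear
v_40db_right = [0.3, 0.4, 0.5, 0.6]   # 40 dB-tone at the right ear

print("left arrangement :", median(v_40db_left))
print("right arrangement:", median(v_40db_right))
print("pooled           :", pooled_median(v_40db_left, v_40db_right))
```

In this made-up example the two arrangements differ by about 1.1 percentage points, while the pooled median lies midway between them and can be read as the pitch-intensity effect alone.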

When this has been done, the individual results of the 15 subjects reveal
the same tendencies as in the monotic experiment: Qualitatively, i.e.
with respect to its direction, the pitch-intensity effect is the same for
all subjects; quantitatively, the amount of pitch shift is different for
different subjects. Those subjects who indicated a strong effect in the
monotic experiment do so in the dichotic one, too. Likewise, the subjects
who showed only small effects did so in both experiments.



2.3 General results of both experiments 

Keeping in mind these particularities, it appears reasonable to depict
the pitch-intensity effect by the medians of the adjustments of all
subjects. These are shown in Fig. 3. Squares correspond to the monotic
experiment, circles to the dichotic one. Vertical bars indicate
interquartile ranges. Every point is based on at least (15+8)·4 = 92
(monotic experiment), and 15·4 = 60 (dichotic experiment) adjustments,
respectively.
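
The adjustment counts and the summary statistics can be reproduced with a few lines. This is a sketch only: the sample values are hypothetical, and the "inclusive" quartile method is an assumption, since the text does not say how the quartiles were determined.

```python
from statistics import quantiles

SUBJECTS = 15            # all subjects, one ear each
SECOND_EAR_SUBJECTS = 8  # subjects who repeated the monotic run with the other ear
MIN_ADJUSTMENTS = 4      # minimum adjustments per subject and condition

min_monotic = (SUBJECTS + SECOND_EAR_SUBJECTS) * MIN_ADJUSTMENTS
min_dichotic = SUBJECTS * MIN_ADJUSTMENTS
print(min_monotic, min_dichotic)  # prints: 92 60

def summarize(adjustments):
    """Return median and quartiles of a list of pitch shifts."""
    q1, q2, q3 = quantiles(adjustments, n=4, method="inclusive")
    return {"q1": q1, "median": q2, "q3": q3}

# Hypothetical pitch shifts in percent, for illustration only.
print(summarize([-1.2, -0.9, -0.8, -0.6, -0.3]))
```

The vertical bars in a plot of this kind would then span q1 to q3 around each median.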

Since the results of both experiments agree well, the pitch-intensity
effect as revealed by these experiments can be represented with good
approximation by the straight lines in Fig. 3.




Fig. 3. Median results of all subjects. Squares: monotic experiment;
circles: dichotic experiment. Vertical bars: interquartile ranges.

3. Discussion

The present results are well in line with those of previous investigations
(Stevens, 1935; Snow, 1936; Walliser, 1969). This consistency of the
phenomenon and, in addition, the consistency revealed by the dichotic
experiment, make it highly probable that the pitch-intensity effect is not
an artifact but a fact. It should be recognized that Cohen's (1961) results
do not disprove the phenomenon but just are not suitable to prove it (as
Cohen concluded himself). This principally is due to the relatively large
pitch-matching error induced by Cohen's particular method.

With respect to the question of how the auditory system detects the pitch
of pure tones, the pitch-intensity effect as such does not appear very
conclusive. However, if the phenomenon can be consistently related to other
phenomena, its significance for hearing theory may become considerable.
Phenomena which should be regarded in this respect are the following ones.



(1) The SPL-dependence of the shape of pure-tone thresholds masked by
narrow-band maskers (see, e.g. Zwicker and Feldtkeller, 1967). These
masked thresholds suggest that the principal excitation of the ear
produced by a pure tone shifts slightly toward a region which
corresponds to higher frequencies if the SPL is increased. Hence, if the
position of principal excitation along the cochlear partition were
a determining cue of pure-tone pitch, the nonlinearity of masked
thresholds would suggest what actually is observed at high frequencies:
a sharpening of pitch with increasing SPL. At low frequencies, where
the pitch-intensity effect is negative, however, this relation does
not hold. As has been discussed already (Terhardt, 1972), at low
frequencies another effect may become dominant:

(2) The absolute threshold of hearing at low frequencies can be considered
as some kind of a "masked threshold" where the masker is an internal
noise whose intensity increases with decreasing frequency, i.e., a
"low-pass noise" (Zwicker, 1958). As has been shown by several authors
(e.g. Egan and Meyer, 1950; Terhardt and Fastl, 1971), the pitch of
a pure tone in the frequency region just above a (real) low-pass noise
is raised. This can be ascribed to partial masking of the tone by the
low-pass noise (Egan and Meyer, 1950; Terhardt, 1972). Hence, if the
tone's SPL is increased, the effect of partial masking decreases and
pitch shifts downward. This is the mechanism which may account for the
negative pitch-intensity effect at low frequencies.

(3) Responses of single units of the acoustic nerve as a function of the
stimulating tone's frequency and SPL reveal a nonlinearity very similar
to the one mentioned under (1). The spike-rate-frequency pattern
of a single unit as obtained by, e.g. Rose et al. (1971) exhibits a
distinct maximum which shifts slightly toward lower frequencies with
increasing SPL of the stimulus. This indicates that, at higher SPLs, a
tone with fixed frequency predominantly stimulates those fibers which
correspond to higher frequencies. The nonlinearity which has been found
by Rhode (1971) in the frequency response of basilar-membrane
displacement represents another effect with the same consequence: maximal
displacement at a fixed place of the basilar membrane is attained by
slightly lower tone frequencies at high SPLs than at low SPLs.

At present, these correlations between the pitch-intensity effect and other
psychophysical and physiological phenomena still appear rather vague. More
related data, in particular from the physiological side, would be valuable.
Thus, the theoretical implications of the pitch-intensity effect are still
difficult to assess. The phenomenon itself obviously is a fact.







Acknowledgments. The author is indebted to H. Schütte and J. Kapser for
their activities in executing the experiments. This research was carried
out in the Sonderforschungsbereich Kybernetik, München, supported by the
Deutsche Forschungsgemeinschaft.

* The dichotic procedure was already employed in 1953 by Ward with 5
subjects, at 250 Hz (cf. "Foundations of Modern Auditory Theory", J.V.
Tobias, Ed., Vol. 1, Academic Press, New York, 1970, p. 426). His results
agree well with the present ones at 200 Hz.



References 

van den Brink, G. (1970). "Experiments on Binaural Diplacusis and Tone
Perception," in: "Frequency Analysis and Periodicity Detection
in Hearing," Plomp, R. and Smoorenburg, G.F. (Eds.), Sijthoff,
Leiden, pp. 362-372.

Cohen, A. (1961). "Further Investigation of the Effects of Intensity upon
the Pitch of Pure Tones," J. Acoust. Soc. Amer. 33, 1363-1376.

Egan, J.P., and Meyer, D.R. (1950). "Changes in Pitch of Tones of Low
Frequency as a Function of the Pattern of Excitation Produced by
a Band of Noise," J. Acoust. Soc. Amer. 22, 827-833.

Rhode, W.S. (1971). "Observations of the Vibration of the Basilar Membrane
in Squirrel Monkeys using the Mössbauer Technique," J. Acoust.
Soc. Amer. 49, 1218-1231.

Rose, J.E., Hind, J.E., Anderson, D.J., and Brugge, J.F. (1971). "Some
Effects of Stimulus Intensity on Response of Auditory Nerve
Fibers in the Squirrel Monkey," J. Neurophysiol. 34, 685-699.

Snow, W.B. (1936). "Change of Pitch with Loudness at Low Frequencies,"
J. Acoust. Soc. Amer. 8, 14-19.

Stevens, S.S. (1935). "The Relation of Pitch to Intensity," J. Acoust. Soc.
Amer. 6, 150-154.

Terhardt, E. (1972). "Zur Tonhöhenwahrnehmung von Klängen. II. Ein
Funktionsschema," Acustica 26, 187-199.

Terhardt, E., and Fastl, H. (1971). "Zum Einfluß von Störtönen und
Störgeräuschen auf die Tonhöhe von Sinustönen," Acustica 25, 53-61.

Vierling, O. (1934). "Der Einfluß der Lautstärke auf die Tonhöhenempfindung,"
Z. f. techn. Physik 15, 641.

Walliser, K. (1969). "Über die Abhängigkeiten der Tonhöhenempfindung von
Sinustönen vom Schallpegel, von überlagertem drosselndem
Störschall und von der Darbietungsdauer," Acustica 21, 211-221.

Zurmühl, G. (1930). "Abhängigkeit der Tonhöhenempfindung von der Lautstärke
und ihre Beziehungen zur Helmholtzschen Resonanztheorie des
Hörens," Z. Sinnesphysiol. 61, 40-86.

Zwicker, E. (1958). "Über psychologische und methodische Grundlagen der
Lautheit," Acustica 8, 237-258.

Zwicker, E., and Feldtkeller, R. (1967). Das Ohr als Nachrichtenempfänger
(Hirzel, Stuttgart, 2nd Ed.).








